REMARKS

Status of the Claims 

Claims 1, 3, 5-11, 13-15, 17 and 18 are currently pending in the application. Claims 1, 3-11,
13-15 and 17 stand rejected. Claims 5-10, 13 and 14 have been withdrawn in response to a
prior restriction requirement. Claim 1 has been amended as set forth herein. Claims 2, 4, 12
and 16 are cancelled. All cancellations and amendments are made without prejudice or 
disclaimer. No new matter has been added by way of the present amendments. Specifically, the 
amendment to claim 1 is supported by the specification at page 20, lines 10-15 (claim 1, part (e)), 
page 9, lines 8-12 (claim 1, part (e)), abstract (claim 1, part (f)), page 19, lines 14-20 and Figure 
5 (claim 1, part (f) and part (g)), page 18, line 16 to page 19, line 1 (claim 1, part (h), and new 
claim 18) and page 9, lines 21-23 (claim 1, part (i)). Reconsideration is respectfully requested. 

Interview 

Applicants and Applicants' representatives thank the Examiner for extending the courtesy 
of an Interview on March 21, 2006. The substance of the interview is substantially as described 
in the Interview Summary of record. Briefly, amendments to claim 1, presented herein, were 
discussed, directed at further differentiating the present invention over the prior art and further 
clarifying that which Applicants consider to be their invention, addressing the written description 
issues. It is believed that the amendments and information discussed at the interview fully 
address all of the remaining issues and a Notice of Allowance is earnestly solicited. 
Information Disclosure Statement

Applicants respectfully request the Examiner consider the references submitted in the 
Information Disclosure Statements of December 19, 2002 and April 10, 2003. If the references 
have already been considered, Applicants respectfully request the Examiner return to Applicants 
signed PTO-1449 forms indicating the references have been considered. 

Rejections Under 35 U.S.C. § 112, First Paragraph

Claims 1, 3, 4, 11, 15 and 17 stand rejected under 35 U.S.C. § 112, first paragraph, for 
failing to comply with the written description requirement. (See, Office Action of November 18, 
2005, at page 2, hereinafter referred to as "Office Action"). Claim 4 has been cancelled herein 
without prejudice or disclaimer, thus obviating the rejection as to claim 4. Applicants traverse 
the rejection as to the remaining claims as set forth herein. 

The Examiner states the claims "broadly state an aspartic enzyme that produces plasma 
protein fragments having an inhibitory activity of metastasis and growth of cancer," and that "the 
claimed product is not isolated, nor described by any particular structure." (Id. at page 3). The
Examiner further states that a "simple statement of function for the claimed protein does not 
ful[fill] the written description requirement." (Id.). Although Applicants do not agree that claim 
1 lacks written description support in the specification, to expedite prosecution, claim 1 has been 
amended to recite the following additional features: (e) cleaves plasminogen at 73L-74F and/or 
451L-452P to produce fragments comprising Kringles 1 to 4 of plasminogen; (f) is an aspartic 
protease; (g) has an activity that is inhibited by an aspartic protease inhibitor; (h) is isolated from 
mammalian cells by binding to an affinity chromatography column comprising an aspartic 
protease inhibitor as a ligand; and (i) is Plasminogen Angiostatin Converting Enzyme at pH 4
(PACE4). Support for this amendment may be found in the as-filed specification at page 20, 
lines 10-15 (claim 1, part (e)), page 9, lines 8-12 (claim 1, part (e)), abstract (claim 1, part (f)), 
page 19, lines 14-20 and Figure 5 (claim 1, part (f) and part (g)), page 18, line 16 to page 19, line 
1 (claim 1, part (h)) and page 9, lines 21-23 (claim 1, part (i)). 

Claim 1 recites more than just a broad statement of an aspartic enzyme. The enzyme is 
defined functionally in claim 1 by precisely defining the enzyme's substrate and product. 
Furthermore, the name of the enzyme and its protease activity are recited in claim 1, at parts (e),
(f) and (i). Therefore, claim 1 is clearly supported by the as-filed specification and one of
ordinary skill in the art, having considered the as-filed specification, would know that Applicants
were in possession of the presently claimed invention.

Reconsideration and withdrawal of the written description rejection of claims 1, 3, 4, 11,
15 and 17 are respectfully requested. 

Rejections Under 35 U.S.C. § 102(b)

Claims 1, 3, 4, 11, 15 and 17 stand rejected under 35 U.S.C. § 102(b) as being anticipated 
by Gately et al., Cancer Res., 56:4887-4890, 1996 (hereinafter referred to as "Gately et al."). 
(See, Id. at page 4). Claim 4 has been cancelled herein without prejudice or disclaimer, thus
obviating the rejection as to claim 4. Applicants traverse the rejection as to the remaining claims 
as set forth herein. 

The Examiner states that "Gately's factor derived from PC-3 would in fact contain the
claimed enzyme," and that "while Gately seemingly did not perform the proteinase inhibitor
assay at the same pH as Applicants this does not teach away from the simple fact that [the]
enzyme of Gately contained in the PC-3 supernatant is the enzyme of Applicants," (Id. at pages
4-5, emphasis added). However, in fact, it is biochemically impossible for the enzyme of Gately
et al. to be the enzyme of the presently claimed invention. 

The Enzyme of the Present Invention 

As discussed at the interview, the present enzyme is an aspartic protease. (See,
specification, as-filed, at abstract, page 19, lines 14-20 and Figure 5). Aspartic proteases are
acidic proteases, meaning they belong to the family of proteases that operate at an acidic pH.
(See, Rao et al., Microbiol Molecul Biol Rev., 62(3):597-635, 1998, at pages 602-604, copy
attached hereto as Exhibit A). The acidic proteases function most efficiently at an acidic pH.
(Id. at page 602). Aspartic proteases, as exemplified by, for instance, pepsin, a very well-studied
aspartic protease, catalyze peptide hydrolysis via active site aspartic acid residues. (Id. at pages
603-604 and especially Figure 3(A)).

Aspartic proteases are inhibited with an exceptionally high specificity by pepstatin. (See,
Dinu et al., Roum. Biotechnol. Lett., 7(3):753-758, 2002, at abstract and pages 756 and 757, copy
of reference attached hereto as Exhibit B; see also, Barrett et al., Biochem. J., 127:439-441,
1972, copy attached hereto as Exhibit C). Pepstatin is a very well-characterized molecule,
having been studied for over 30 years. (Id.). It is well known that pepstatin only inhibits, in a
very specific manner, aspartic proteases. (Id.).

One of ordinary skill in the art would immediately conclude that the enzyme of the 
presently claimed invention can only be an aspartic protease based on the results of the data 
disclosed in Figure 5 of the as-filed specification. Specifically, Figure 5 depicts the results of
inhibition studies wherein the enzyme of the present invention was incubated with various types
of protease enzyme inhibitors. Starting at the very top of Figure 5, the lane having no
enzyme, the negative control, exhibits no activity (no solid bar extending to the right). However,
the enzyme alone, in the presence of no inhibitors, the positive control, exhibits activity as
indicated by the solid bar extending to the right. When the enzyme of the present invention is
incubated with various known and well-characterized protease inhibitors, only pepstatin
specifically and totally inhibited the activity of the enzyme of the presently claimed invention.
(See, Figure 5, lane marked "Pepstatin A", of the as-filed specification). Therefore, the enzyme
of the presently claimed invention can only be an aspartic protease and cannot be any other kind 
of protease, such as, for instance, a serine protease. 

The Enzyme Disclosed by Gately et al.

Serine proteases, in contrast, have entirely different enzymological mechanisms. (See,
Rao et al., at page 603, and Stryer, "Biochemistry", 3rd Edition, W.H. Freeman & Co., N.Y.,
1988, pages 220-226, copy of pages attached hereto as Exhibit D). Serine proteases, like the
protease disclosed in Gately et al., and like chymotrypsin, a very well studied member of the
serine protease family, have an enzymatic mechanism that utilizes a triad of active site amino
acid residues consisting of serine, histidine and aspartic acid. (See, Stryer, at page 224, Figure
9-41). The mechanism of serine proteases is believed to include a tetrahedral transition state, as
depicted in Figure 9-44 of Stryer, at page 225. This is in marked contrast to the mechanism of
aspartic proteases, which require the presence of two aspartic acid residues and one molecule of
water in the active site to achieve peptide bond hydrolysis.

The enzyme disclosed by Gately et al. is only a serine protease and cannot also be an
aspartic protease. As shown in Table 1 of Gately et al., pepstatin has no effect on the activity of
the Gately et al. enzyme. However, several serine protease-specific inhibitors completely inhibit
the activity of the enzyme of Gately et al. (See, Gately et al., at Table 1 and paragraph bridging
pages 4888-4889). Thus, one of ordinary skill in the art would know that the enzyme of Gately 
et al. can only be a serine protease, and would also know that the enzyme of Gately et al. cannot 
be an aspartic protease. 

The mechanisms of serine proteases and aspartic proteases have been studied for many
years. Chemical compounds that specifically inhibit a serine protease, but not any other types of 
proteases, have also been known for many decades. Likewise, chemical compounds, such as 
pepstatin, have been known for decades and are known to only inhibit aspartic proteases. The 
reason these chemical compounds are so exceptionally specific to inhibiting a particular kind of 
protease lies in the unique enzymatic catalytic mechanism of each family of protease. Thus, 
because the enzymatic mechanism of serine proteases is so different from that of aspartic 
proteases, an inhibitor of serine proteases is often not able to inhibit aspartic proteases. 

Therefore, it is enzymatically impossible for the enzyme of the presently claimed 
invention to be a serine protease or to be the protease disclosed by Gately et al. because, as 
described in Table 1 of Gately et al. and Figure 5 of the present specification, the enzyme of the 
presently claimed invention is specifically inhibited by pepstatin, whereas that of Gately et al. is 
not. Thus, the enzyme disclosed by Gately et al. must be a different enzyme and the disclosure
of Gately et al. cannot anticipate the presently claimed invention. 

Furthermore, the enzyme of the present invention is not only obtainable from PC-3 
culture supernatants (human prostate cancer cells). The enzyme of the present invention is also
obtainable from colon cancer cells (COLON) and lung cancer cells (LL/2). (See, as-filed
specification, at Example 6 starting at page 40, Figure 10, page 33, lines 19-21 and at page 54). 
Thus, in contrast to the enzyme of Gately et al., which is only isolatable from PC-3 cells, the 
presently claimed enzyme is isolatable from multiple, different cell lines. 

Dependent claims 3, 4, 11, 15 and 17 are not anticipated as, inter alia, depending from a 
non-anticipated base claim, claim 1.

Reconsideration and withdrawal of the anticipation rejection of claims 1, 3, 4, 11, 15 and 
17 are respectfully requested.
CONCLUSION

If the Examiner has any questions or comments, please contact Thomas J. Siepmann, 
Ph.D., Registration No. 57,374 at the offices of Birch, Stewart, Kolasch & Birch, LLP.

If necessary, the Commissioner is hereby authorized in this, concurrent, and future 
replies, to charge payment or credit any overpayment to our Deposit Account No. 02-2448 for 
any additional fees required under 37 C.F.R. § 1.16 or under § 1.17; particularly, extension of 
time fees. 

Dated: April 18, 2006 Respectfully submitted,



Marc S. Weiner
Registration No.: 32,181

BIRCH, STEWART, KOLASCH & BIRCH, LLP
8110 Gatehouse Road
Suite 100 East 
P.O. Box 747 

Falls Church, Virginia 22040-0747 
(703) 205-8000 
Attorney for Applicant 



Attachments: Exhibits A-D 
INTRODUCTION

Proteases are the single class of enzymes which occupy a 
pivotal position with respect to their applications in both phys- 
iological and commercial fields. Proteolytic enzymes catalyze 
the cleavage of peptide bonds in other proteins. Proteases are 
degradative enzymes which catalyze the total hydrolysis of pro- 
teins. Advances in analytical techniques have demonstrated 
that proteases conduct highly specific and selective modifications
of proteins such as activation of zymogenic forms of
enzymes by limited proteolysis, blood clotting and lysis of fibrin
clots, and processing and transport of secretory proteins across 
the membranes. The current estimated value of the worldwide 
sales of industrial enzymes is $1 billion (72). Of the industrial 
enzymes, 75% are hydrolytic. Proteases represent one of the 
three largest groups of industrial enzymes and account for 
about 60% of the total worldwide sale of enzymes (Fig. 1). 
Proteases execute a large variety of functions, extending from 
the cellular level to the organ and organism level, to produce 
cascade systems such as hemostasis and inflammation. They
are responsible for the complex processes involved in the normal
physiology of the cell as well as in abnormal pathophysiological
conditions. Their involvement in the life cycle of disease-causing
organisms has led them to become a potential
target for developing therapeutic agents against fatal diseases
such as cancer and AIDS. Proteases have a long history of
application in the food and detergent industries. Their application
in the leather industry for dehairing and bating of hides
to substitute currently used toxic chemicals is a relatively new
development and has conferred added biotechnological importance
(235). The vast diversity of proteases, in contrast to the
specificity of their action, has attracted worldwide attention in
attempts to exploit their physiological and biotechnological
applications (64, 225). The major producers of proteases
worldwide are listed in Table 1.

FIG. 1. Distribution of enzyme sales. The contribution of different enzymes
to the total sale of enzymes is indicated. The shaded portion indicates the total
sale of proteases.

TABLE 2. Specificity of proteases

Enzyme                       Peptide bond cleaved (a)
Trypsin                      -Lys (or Arg)↓-
Chymotrypsin, subtilisin     -Trp (or Tyr, Phe, Leu)↓-
Staphylococcus V8 protease   -Asp (or Glu)↓-
Papain                       -Phe (or Val, Leu)-Xaa↓-
Thermolysin                  -↓Leu (or Phe)-
Pepsin                       -Phe (or Tyr, Leu)↓Trp (or Phe, Tyr)-

(a) The arrow indicates the site of action of the protease. Xaa, any amino acid
residue.

SCOPE OF THE REVIEW

Since proteases are enzymes of metabolic as well as commercial
importance, there is a vast literature on their biochemical
and biotechnological aspects (64, 128, 192, 235, 309). However,
the earlier reviews did not deal with the molecular
biology of proteases, which offers new possibilities and potentials
for their biotechnological applications. This review aims at
analyzing the updated information on biochemical and genetic
aspects of proteases, with special reference to some of the
advances made in these areas. We also attempt to address
some of the deficiencies in the earlier reviews and to identify
problems, along with possible solutions, for the successful applications
of proteases for the benefit of mankind. The genetic
engineering approaches are also discussed, from the perspective
of making better use of proteases. The reference to plant
and animal proteases has been made to complete the overview.
However, the major emphasis of the review is on the microbial
proteases.

SOURCES OF PROTEASES

Since proteases are physiologically necessary for living organisms,
they are ubiquitous, being found in a wide diversity of
sources such as plants, animals, and microorganisms.



TABLE 1. Major protease producers

Company                  Country          Market share (%)
Novo Industries          Denmark          40
Gist-Brocades            Netherlands      20
Genencor International   United States    10
Miles Laboratories       United States    10
Others                                    20


Plant Proteases 

The use of plants as a source of proteases is governed by 
several factors such as the availability of land for cultivation 
and the suitability of climatic conditions for growth. Moreover, 
production of proteases from plants is a time-consuming pro- 
cess. Papain, bromelain, keratinases, and ficin represent some 
of the well-known proteases of plant origin. 

Papain. Papain is a traditional plant protease and has a long 
history of use (250). It is extracted from the latex of Carica 
papaya fruits, which are grown in subtropical areas of west and 
central Africa and India. The crude preparation of the enzyme 
has a broader specificity due to the presence of several pro- 
teinase and peptidase isozymes. The performance of the en- 
zyme depends on the plant source, the climatic conditions for 
growth, and the methods used for its extraction and purifica- 
tion. The enzyme is active between pH 5 and 9 and is stable up 
to 80 or 90°C in the presence of substrates. It is extensively
used in industry for the preparation of highly soluble and 
flavored protein hydrolysates. 

Bromelain. Bromelain is prepared from the stem and juice 
of pineapples. The major supplier of the enzyme is Great Food 
Biochem., Bangkok, Thailand. The enzyme is characterized as
a cysteine protease and is active from pH 5 to 9. Its inactivation 
temperature is 70°C, which is lower than that of papain. 

Keratinases. Some of the botanical groups of plants produce 
proteases which degrade hair. Digestion of hair and wool is 
important for the production of essential amino acids such as 
lysine and for the prevention of clogging of wastewater sys- 
tems. 

Animal Proteases 

The most familiar proteases of animal origin are pancreatic 
trypsin, chymotrypsin, pepsin, and rennins (23, 97). These are 
prepared in pure form in bulk quantities. However, their pro- 
duction depends on the availability of livestock for slaughter, 
which in turn is governed by political and agricultural policies. 

Trypsin. Trypsin (Mr 23,300) is the main intestinal digestive
enzyme responsible for the hydrolysis of food proteins. It is a 
serine protease and hydrolyzes peptide bonds in which the 
carboxyl groups are contributed by the lysine and arginine 
residues (Table 2). Based on the ability of protease inhibitors 
to inhibit the enzyme from the insect gut, this enzyme has 
received attention as a target for biocontrol of insect pests. 
Trypsin has limited applications in the food industry, since the
protein hydrolysates generated by its action have a highly bitter
taste. Trypsin is used in the preparation of bacterial media and 
in some specialized medical applications. 

Chymotrypsin. Chymotrypsin (Mr 23,800) is found in animal
pancreatic extract. Pure chymotrypsin is an expensive enzyme 
and is used only for diagnostic and analytical applications. It is 
specific for the hydrolysis of peptide bonds in which the carboxyl
groups are provided by one of the three aromatic amino
acids, i.e., phenylalanine, tyrosine, or tryptophan. It is used 
extensively in the deallergenizing of milk protein hydrolysates. 
It is stored in the pancreas in the form of a precursor, chymo- 
trypsinogen, and is activated by trypsin in a multistep process. 

Pepsin. Pepsin (Mr 34,500) is an acidic protease that is found
in the stomachs of almost all vertebrates. The active enzyme is 
released from its zymogen, i.e., pepsinogen, by autocatalysis in 
the presence of hydrochloric acid. Pepsin is an aspartyl pro- 
tease and resembles human immunodeficiency virus type 1 
(HIV-1) protease, responsible for the maturation of HIV-1. It
exhibits optimal activity between pH 1 and 2, while the optimal 
pH of the stomach is 2 to 4. Pepsin is inactivated above pH 6.0. 
The enzyme catalyzes the hydrolysis of peptide bonds between 
two hydrophobic amino acids. 

Rennin. Rennet is a pepsin-like protease (rennin, chymosin; 
EC 3.4.23.4) that is produced as an inactive precursor, proren- 
nin, in the stomachs of all nursing mammals. It is converted to 
active rennin (Mr 30,700) by the action of pepsin or by its
autocatalysis. It is used extensively in the dairy industry to 
produce a stable curd with good flavor. The specialized nature 
of the enzyme is due to its specificity in cleaving a single 
peptide bond in κ-casein to generate insoluble para-κ-casein
and C-terminal glycopeptide. 



Microbial Proteases 

The inability of the plant and animal proteases to meet 
current world demands has led to an increased interest in 
microbial proteases. Microorganisms represent an excellent 
source of enzymes owing to their broad biochemical diversity 
and their susceptibility to genetic manipulation. Microbial pro- 
teases account for approximately 40% of the total worldwide 
enzyme sales (72). Proteases from microbial sources are pre- 
ferred to the enzymes from plant and animal sources since they 
possess almost all the characteristics desired for their biotech- 
nological applications. 

Bacteria. Most commercial proteases, mainly neutral and 
alkaline, are produced by organisms belonging to the genus 
Bacillus. Bacterial neutral proteases are active in a narrow pH 
range (pH 5 to 8) and have relatively low thermotolerance. 
Due to their intermediate rate of reaction, neutral proteases 
generate less bitterness in hydrolyzed food proteins than do 
the animal proteinases and hence are valuable for use in the 
food industry. Neutrase, a neutral protease, is insensitive to the
natural plant proteinase inhibitors and is therefore useful in 
the brewing industry. The bacterial neutral proteases are char- 
acterized by their high affinity for hydrophobic amino acid 
pairs. Their low thermotolerance is advantageous for control- 
ling their reactivity during the production of food hydrolysates 
with a low degree of hydrolysis. Some of the neutral proteases 
belong to the metalloprotease type and require divalent metal 
ions for their activity, while others are serine proteinases, 
which are not affected by chelating agents. 

Bacterial alkaline proteases are characterized by their high 
activity at alkaline pH, e.g., pH 10, and their broad substrate 
specificity. Their optimal temperature is around 60°C. These
properties of bacterial alkaline proteases make them suitable
for use in the detergent industry. 

Fungi. Fungi elaborate a wider variety of enzymes than do 
bacteria. For example, Aspergillus oryzae produces acid, neutral,
and alkaline proteases. The fungal proteases are active
over a wide pH range (pH 4 to 11) and exhibit broad substrate 
specificity. However, they have a lower reaction rate and worse 
heat tolerance than do the bacterial enzymes. Fungal enzymes 
can be conveniently produced in a solid-state fermentation 
process. Fungal acid proteases have an optimal pH between 4 
and 4.5 and are stable between pH 2.5 and 6.0. They are 
particularly useful in the cheesemaking industry due to their 
narrow pH and temperature specificities. Fungal neutral pro- 
teases are metalloproteases that are active at pH 7.0 and are 
inhibited by chelating agents. In view of the accompanying 
peptidase activity and their specific function in hydrolyzing 
hydrophobic amino acid bonds, fungal neutral proteases sup- 
plement the action of plant, animal, and bacterial proteases in 
reducing the bitterness of food protein hydrolysates. Fungal 
alkaline proteases are also used in food protein modification. 

Viruses. Viral proteases have gained importance due to their 
functional involvement in the processing of proteins of viruses 
that cause certain fatal diseases such as AIDS and cancer. 
Serine, aspartic, and cysteine peptidases are found in various 
viruses (236). All of the virus-encoded peptidases are endopep- 
tidases; there are no metallopeptidases. Retroviral aspartyl 
proteases that are required for viral assembly and replication 
are homodimers and are expressed as a part of the polyprotein 
precursor. The mature protease is released by autolysis of the 
precursor. An extensive literature is available on the expres- 
sion, purification, and enzymatic analysis of retroviral aspartic 
protease and its mutants (147). Extensive research has focused 
on the three-dimensional structure of viral proteases and their 
interaction with synthetic inhibitors with a view to designing 
potent inhibitors that can combat the relentlessly spreading 
and devastating epidemic of AIDS. 

Thus, although proteases are widespread in nature, microbes
serve as a preferred source of these enzymes because of
their rapid growth, the limited space required for their culti- 
vation, and the ease with which they can be genetically manip- 
ulated to generate new enzymes with altered properties that 
are desirable for their various applications. 

CLASSIFICATION OF PROTEASES 

According to the Nomenclature Committee of the Interna- 
tional Union of Biochemistry and Molecular Biology, pro- 
teases are classified in subgroup 4 of group 3 (hydrolases) 
(114a). However, proteases do not comply easily with the gen- 
eral system of enzyme nomenclature due to their huge diversity 
of action and structure. Currently, proteases are classified on 
the basis of three major criteria: (i) type of reaction catalyzed, 
(ii) chemical nature of the catalytic site, and (iii) evolutionary 
relationship with reference to structure (12). 

Proteases are grossly subdivided into two major groups, i.e., 
exopeptidases and endopeptidases, depending on their site of 
action. Exopeptidases cleave the peptide bond proximal to the 
amino or carboxy termini of the substrate, whereas endopep- 
tidases cleave peptide bonds distant from the termini of the 
substrate. Based on the functional group present at the active 
site, proteases are further classified into four prominent 
groups, i.e., serine proteases, aspartic proteases, cysteine pro- 
teases, and metalloproteases (85). There are a few miscella- 
neous proteases which do not precisely fit into the standard 
classification, e.g., ATP-dependent proteases which require 
ATP for activity (183). Based on their amino acid sequences,
proteases are classified into different families (5) and further
subdivided into "clans" to accommodate sets of peptidases that
have diverged from a common ancestor (236). Each family of
peptidases has been assigned a code letter denoting the type of
catalysis, i.e., S, C, A, M, or U for serine, cysteine, aspartic,
metallo-, or unknown type, respectively.

TABLE 3. Classification of proteases

Protease                          Mode of action (a)    EC no.
Cysteine type protease
                                                        3.4.15
Dipeptidases
                                                        3.4.19
Endopeptidases                                          3.4.21-3.4.34
  Serine protease                                       3.4.21
  Cysteine protease                                     3.4.22
  Aspartic protease                                     3.4.23
  Metalloprotease                                       3.4.24
  Endopeptidases of unknown
    catalytic mechanism                                 3.4.99

(a) Open circles represent the amino acid residues in the polypeptide chain.
Solid circles indicate the terminal amino acids, and stars signify the blocked
termini. Arrows show the sites of action of the enzyme.

Exopeptidases 

The exopeptidases act only near the ends of polypeptide 
chains. Based on their site of action at the N or C terminus, 
they are classified as amino- and carboxypeptidases, respec- 
tively. 

Aminopeptidases. Aminopeptidases act at a free N terminus 
of the polypeptide chain and liberate a single amino acid res- 
idue, a dipeptide, or a tripeptide (Table 3). They are known to 
remove the N-terminal Met that may be found in heterolo- 
gously expressed proteins but not in many naturally occurring 
mature proteins. Aminopeptidases occur in a wide variety of 
microbial species including bacteria and fungi (310). In gen- 
eral, aminopeptidases are intracellular enzymes, but there has 
been a single report on an extracellular aminopeptidase pro- 
duced by A. oryzae (150). The substrate specificities of the 
enzymes from bacteria and fungi are distinctly different in that 
the organisms can be differentiated on the basis of the profiles 
of the products of hydrolysis (31). Aminopeptidase I from 
Escherichia coli is a large protease (400,000 Da). It has a broad
pH optimum of 7.5 to 10.5 and requires Mg2+ or Mn2+ for
optimal activity (48). The Bacillus licheniformis aminopeptidase
has a molecular weight of 34,000. It contains 1 g-atom of
Zn2+ per mol, and its activity is enhanced by Co2+ ions. On the
other hand, aminopeptidase II from B. stearothermophilus is a
dimer with a molecular weight of 80,000 to 100,000 (272) and
is activated by Zn2+, Mn2+, or Co2+ ions.

Carboxypeptidases. The carboxypeptidases act at C termi- 
nals of the polypeptide chain and liberate a single amino acid 
or a dipeptide. Carboxypeptidases can be divided into three 
major groups, serine carboxypeptidases, metallocarboxypepti- 
dases, and cysteine carboxypeptidases, based on the nature of 
the amino acid residues at the active site of the enzymes. The
serine carboxypeptidases isolated from Penicillium spp., Saccharomyces
spp., and Aspergillus spp. are similar in their substrate
specificities but differ slightly in other properties such as
pH optimum, stability, molecular weight, and effect of inhibitors.
Metallocarboxypeptidases from Saccharomyces spp. (61)
and Pseudomonas spp. (174) require Zn2+ or Co2+ for their
activity. The enzymes can also hydrolyze the peptides in which
the peptidyl group is replaced by a pteroyl moiety or by acyl
groups.

Endopeptidases 

Endopeptidases are characterized by their preferential ac- 
tion at the peptide bonds in the inner regions of the polypep- 
tide chain away from the N and C termini. The presence of the 
free amino or carboxyl group has a negative influence on enzyme
activity. The endopeptidases are divided into four subgroups
based on their catalytic mechanism: (i) serine proteases,
(ii) aspartic proteases, (iii) cysteine proteases, and (iv)
metalloproteases. To facilitate quick and unambiguous refer- 
ence to a particular family of peptidases, Rawlings and Barrett 
have assigned a code letter denoting the catalytic type, i.e., S, 
C, A, M, or U (see above) followed by an arbitrarily assigned
number (236). 

Serine proteases. Serine proteases are characterized by the 
presence of a serine group in their active site. They are nu- 
merous and widespread among viruses, bacteria, and eu- 
karyotes, suggesting that they are vital to the organisms. Serine 
proteases are found in the exopeptidase, endopeptidase, oli- 
gopeptidase, and omega peptidase groups. Based on their 
structural similarities, serine proteases have been grouped into 
20 families, which have been further subdivided into about six 
clans with common ancestors (12). The primary structures of 
the members of four clans, chymotrypsin (SA), subtilisin (SB), 
carboxypeptidase C (SC), and Escherichia D-Ala-D-Ala peptidase
A (SE) are totally unrelated, suggesting that there are at
least four separate evolutionary origins for serine proteases. 
Clans SA, SB, and SC have a common reaction mechanism 
consisting of a common catalytic triad of the three amino acids, 
serine (nucleophile), aspartate (electrophile), and histidine 
(base). Although the geometric orientations of these residues 
are similar, the protein folds are quite different, forming a 
typical example of convergent evolution. The catalytic mechanisms
of clans SE and SF (repressor LexA) are distinctly
different from those of clans SA, SB, and SC, since they lack
the classical Ser-His-Asp triad. Another interesting feature of 
the serine proteases is the conservation of glycine residues in 
the vicinity of the catalytic serine residue to form the motif 
Gly-Xaa-Ser-Yaa-Gly (25). 

Serine proteases are recognized by their irreversible inhibition
by 3,4-dichloroisocoumarin (3,4-DCI),
L-3-carboxytrans-2,3-epoxypropyl-leucylamido(4-guanidine)butane (E-64),
diisopropylfluorophosphate (DFP), phenylmethylsulfonyl fluoride
(PMSF), and tosyl-L-lysine chloromethyl ketone (TLCK).
Some of the serine proteases are inhibited by thiol reagents
such as p-chloromercuribenzoate (PCMB) due to the presence
of a cysteine residue near the active site. Serine proteases are
generally active at neutral and alkaline pH, with an optimum
between pH 7 and 11. They have broad substrate specificities
including esterolytic and amidase activity. Their molecular
masses range between 18 and 35 kDa, with the exception of the
serine protease from Blakeslea trispora, which has a molecular
mass of 126 kDa (76). The isoelectric points of serine proteases
are generally
between pH 4 and 6. Serine alkaline proteases that are active 
at highly alkaline pH represent the largest subgroup of serine 
proteases.

(i) Serine alkaline proteases. Serine alkaline proteases are
produced by several bacteria, molds, yeasts, and fungi. They 
are inhibited by DFP or a potato protease inhibitor but not by 
tosyl-L-phenylalanine chloromethyl ketone (TPCK) or TLCK. 
Their substrate specificity is similar to but less stringent than 
that of chymotrypsin. They hydrolyze a peptide bond which has 
tyrosine, phenylalanine, or leucine at the carboxyl side of the 
splitting bond. The optimal pH of alkaline proteases is around 
pH 10, and their isoelectric point is around pH 9. Their mo- 
lecular masses are in the range of 15 to 30 kDa. Although 
alkaline serine proteases are produced by several bacteria such 
as Arthrobacter, Streptomyces, and Flavobacterium spp. (21), 
subtilisins produced by Bacillus spp. are the best known. Alkaline
proteases are also produced by S. cerevisiae (189) and
filamentous fungi such as Conidiobolus spp. (219) and Aspergillus
and Neurospora spp. (165).

(ii) Subtilisins. Subtilisins of Bacillus origin represent the 
second largest family of serine proteases. Two different types 
of alkaline proteases, subtilisin Carlsberg and subtilisin Novo 
or bacterial protease Nagase (BPN'), have been identified. 
Subtilisin Carlsberg produced by Bacillus licheniformis was discovered
in 1947 by Linderstrøm-Lang and Ottesen at the
Carlsberg laboratory. Subtilisin Novo or BPN' is produced by
Bacillus amyloliquefaciens. Subtilisin Carlsberg is widely used
in detergents. Its annual production amounts to about 500 tons 
of pure enzyme protein. Subtilisin BPN' is less commercially 
important. Both subtilisins have a molecular mass of 27.5 kDa 
but differ from each other by 58 amino acids. They have similar 
properties such as an optimal temperature of 60°C and an 
optimal pH of 10. Both enzymes exhibit a broad substrate 
specificity and have an active-site triad made up of Ser221, 
His64 and Asp32. The Carlsberg enzyme has a broader substrate
specificity and does not depend on Ca2+ for its stability.
The active-site conformation of subtilisins is similar to that of 
trypsin and chymotrypsin despite the dissimilarity in their over- 
all molecular arrangements. The serine alkaline protease from 
the fungus Conidiobolus coronatus was shown to possess a 
distinctly different structure from subtilisin Carlsberg in spite 
of their functional similarities (218). 

Aspartic proteases. Aspartic acid proteases, commonly
known as acidic proteases, are the endopeptidases that depend 
on aspartic acid residues for their catalytic activity. Acidic 
proteases have been grouped into three families, namely, pepsin
(A1), retropepsin (A2), and enzymes from pararetroviruses
(A3) (13), and have been placed in clan AA. The members of
families A1 and A2 are known to be related to each other,
while those of family A3 show some relatedness to A1 and A2.
Most aspartic proteases show maximal activity at low pH (pH 
3 to 4) and have isoelectric points in the range of pH 3 to 4.5. 
Their molecular masses are in the range of 30 to 45 kDa. The 
members of the pepsin family have a bilobal structure with the 
active-site cleft located between the lobes (259). The active-site 
aspartic acid residue is situated within the motif Asp-Xaa-Gly, 
in which Xaa can be Ser or Thr. The aspartic proteases are 
inhibited by pepstatin (63). They are also sensitive to diazoketone
compounds such as diazoacetyl-DL-norleucine methyl
ester (DAN) and 1,2-epoxy-3-(p-nitrophenoxy)propane (EPNP)
in the presence of copper ions. Microbial acid proteases exhibit 
specificity against aromatic or bulky amino acid residues on 
both sides of the peptide bond, which is similar to pepsin, but 
their action is less stringent than that of pepsin. Microbial 
aspartic proteases can be broadly divided into two groups, (i) 
pepsin-like enzymes produced by Aspergillus, Penicillium, Rhizopus,
and Neurospora and (ii) rennin-like enzymes produced
by Endothia and Mucor spp.

Cysteine/thiol proteases. Cysteine proteases occur in both
prokaryotes and eukaryotes. About 20 families of cysteine pro- 
teases have been recognized. The activity of all cysteine pro- 
teases depends on a catalytic dyad consisting of cysteine and 
histidine. The order of Cys and His (Cys-His or His-Cys) residues
differs among the families (12). Generally, cysteine proteases
are active only in the presence of reducing agents such
as HCN or cysteine. Based on their side chain specificity, they 
are broadly divided into four groups: (i) papain-like, (ii) tryp- 
sin-like with preference for cleavage at the arginine residue, 
(iii) specific to glutamic acid, and (iv) others. Papain is the 
best-known cysteine protease. Cysteine proteases have neutral 
pH optima, although a few of them, e.g., lysosomal proteases, 
are maximally active at acidic pH. They are susceptible to 
sulfhydryl agents such as PCMB but are unaffected by DFP and 
metal-chelating agents. Clostripain, produced by the anaerobic 
bacterium Clostridium histolyticum, exhibits a stringent speci- 
ficity for arginyl residues at the carboxyl side of the splitting 
bond and differs from papain in its obligate requirement for 
calcium. Streptopain, the cysteine protease produced by Strep- 
tococcus spp., shows a broader specificity, including oxidized 
insulin B chain and other synthetic substrates. Clostripain has 
an isoelectric point of pH 4.9 and a molecular mass of 50 kDa, 
whereas the isoelectric point and molecular mass of streptopain
are pH 8.4 and 32 kDa, respectively.

Metalloproteases. Metalloproteases are the most diverse of 
the catalytic types of proteases (13). They are characterized by 
the requirement for a divalent metal ion for their activity. They 
include enzymes from a variety of origins such as collagenases 
from higher organisms, hemorrhagic toxins from snake ven- 
oms, and thermolysin from bacteria (92, 210, 253, 311, 314). 
About 30 families of metalloproteases have been recognized, 
of which 17 contain only endopeptidases, 12 contain only ex- 
opeptidases, and 1 (M3) contains both endo- and exopepti- 
dases. Families of metalloproteases have been grouped into 
different clans based on the nature of the amino acid that 
completes the metal-binding site; e.g., clan MA has the se- 
quence HEXXH-E and clan MB corresponds to the motif 
HEXXH-H. In one of the groups, the metal atom binds at a 
motif other than the usual motif. 

Based on the specificity of their action, metalloproteases can 
be divided into four groups, (i) neutral, (ii) alkaline, (iii) Myxobacter
I, and (iv) Myxobacter II. The neutral proteases show
specificity for hydrophobic amino acids, while the alkaline pro- 
teases possess a very broad specificity. Myxobacter protease I is 
specific for small amino acid residues on either side of the 
cleavage bond, whereas protease II is specific for a lysine residue
on the amino side of the peptide bond. All of them are inhib- 
ited by chelating agents such as EDTA but not by sulfhydryl 
agents or DFP. 

Thermolysin, a neutral protease, is the most thoroughly 
characterized member of clan MA. Histidine residues from the 
HEXXH motif serve as Zn ligands, and Glu has a catalytic
function (311). Thermolysin produced by B. stearothermophilus
is a single peptide without disulfide bridges and has a molec- 
ular mass of 34 kDa. It contains an essential Zn atom embed- 
ded in a cleft formed between two folded lobes of the protein 
and four Ca atoms which impart thermostability to the protein. 
Thermolysin is a very stable protease, with a half-life of 1 h at
80°C.

Collagenase, another important metalloprotease, was first
discovered in the broth of the anaerobic bacterium Clostridium
histolyticum as a component of toxic products. Later, it was
found to be produced by the aerobic bacterium Achromobacter
iophagus and other microorganisms including fungi. The action
of collagenase is very specific; i.e., it acts only on collagen and
gelatin and not on any of the other usual protein substrates.
Elastase produced by Pseudomonas aeruginosa is another important
member of the neutral metalloprotease family.

Protease:  N  Sn - ... - S3 - S2 - S1  *  S1' - S2' - S3' - ... - Sn'  C

Substrate: N  Pn - ... - P3 - P2 - P1 -|- P1' - P2' - P3' - ... - Pn'  C

FIG. 2. Active sites of proteases. The catalytic site of proteases is indicated
by * and the scissile bond is indicated by -|-; S1 through Sn and S1' through Sn'
are the specificity subsites on the enzyme, while P1 through Pn and P1' through
Pn' are the residues on the substrate accommodated by the subsites on the
enzyme.

The alkaline metalloproteases produced by Pseudomonas
aeruginosa and Serratia spp. are active in the pH range from 7 
to 9 and have molecular masses in the region of 48 to 60 kDa. 
Myxobacter protease I has a pH optimum of 9.0 and a molecular
mass of 14 kDa and can lyse cell walls of Arthrobacter
crystallopoietes, whereas protease II cannot lyse the bacterial
cells. Matrix metalloproteases play a prominent role in the 
degradation of the extracellular matrix during tissue morpho- 
genesis, differentiation, and wound healing and may be useful 
in the treatment of diseases such as cancer and arthritis (26). 

In summary, proteases are broadly classified as endo- or 
exoenzymes on the basis of their site of action on protein 
substrates. They are further categorized as serine proteases, 
aspartic proteases, cysteine proteases, or metalloproteases de- 
pending on their catalytic mechanism. They are also classified 
into different families and clans depending on their amino acid 
sequences and evolutionary relationships. Based on the pH of 
their optimal activity, they are also referred to as acidic, neu- 
tral, or alkaline proteases. 

MECHANISM OF ACTION OF PROTEASES 

The mechanism of action of proteases has been a subject of 
great interest to researchers. Purification of proteases to ho- 
mogeneity is a prerequisite for studying their mechanism of 
action. Vast numbers of purification procedures for proteases, 
involving affinity chromatography, ion-exchange chromatogra- 
phy, and gel filtration techniques, have been well documented. 
Preparative polyacrylamide gel electrophoresis has been used 
for the purification of proteases from Conidiobolus coronatus 
(220). Purification of staphylocoagulase to homogeneity was 
carried out from culture filtrates of Staphylococcus aureus by 
affinity chromatography with a bovine prothrombin-Sepharose 
4B column (109) and gel filtration (335). A number of peptide 
hydrolases have been isolated and purified from E. coli by 
DEAE-cellulose chromatography (217). 

The catalytic site of proteases is flanked on one or both sides 
by specificity subsites, each able to accommodate the side chain 
of a single amino acid residue from the substrate. These sites 
are numbered from the catalytic site S1 through Sn toward the
N terminus of the structure and S1' through Sn' toward the C
terminus. The residues which they accommodate from the substrate
are numbered P1 through Pn and P1' through Pn',
respectively (Fig. 2).
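
The numbering convention described above can be illustrated with a minimal sketch. The function name, the one-letter peptide string, and the way the scissile bond is specified are all assumptions introduced here for illustration; they are not part of the original text.

```python
def label_subsite_positions(peptide: str, scissile_index: int) -> list[tuple[str, str]]:
    """Label each residue of a substrate with its P / P' position.

    `scissile_index` (hypothetical parameter) is the number of residues on
    the N-terminal side of the scissile bond, so the bond lies between
    peptide[scissile_index - 1] (P1) and peptide[scissile_index] (P1').
    """
    if not 0 < scissile_index < len(peptide):
        raise ValueError("scissile bond must lie between two residues")
    labels = []
    for i, residue in enumerate(peptide):
        if i < scissile_index:
            # Residues toward the N terminus: P1 is adjacent to the bond.
            labels.append((f"P{scissile_index - i}", residue))
        else:
            # Residues toward the C terminus: P1' is adjacent to the bond.
            labels.append((f"P{i - scissile_index + 1}'", residue))
    return labels


if __name__ == "__main__":
    for position, residue in label_subsite_positions("AAPFGS", 4):
        print(position, residue)
```

Each labeled residue is understood to occupy the enzyme subsite with the matching index (P1 in S1, P1' in S1', and so on).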

Serine Proteases 

Serine proteases usually follow a two-step reaction for hy- 
drolysis in which a covalently linked enzyme-peptide interme- 
diate is formed with the loss of the amino acid or peptide 
fragment (60). This acylation step is followed by a deacylation 
process which occurs by a nucleophilic attack on the intermediate
by water, resulting in hydrolysis of the peptide. Serine
endopeptidases can be classified into three groups based
mainly on their primary substrate preference: (i) trypsin-like,
which cleave after positively charged residues; (ii) chymotryp- 
sin-like, which cleave after large hydrophobic residues; and 
(iii) elastase-like, which cleave after small hydrophobic residues.
The P1 residue exclusively dictates the site of peptide
bond cleavage. The primary specificity is affected only by the P1
residues; the residues at other positions affect the rate of cleavage.
The subsite interactions are localized to specific amino
acids around the P1 residue to a unique set of sequences on the
enzyme. Some of the serine peptidases from Achromobacter
spp. are lysine-specific enzymes (179), whereas those from 
Clostridium spp. are arginine specific (clostripain) (71) and 
those from Flavobacterium spp. are post proline-specific (329). 
Endopeptidases that are specific to glutamic acid and aspartic 
acid residues have also been found in B. licheniformis and S.
aureus (52).
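
The P1-based grouping above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the residue sets chosen for each group (K/R for positively charged, F/Y/W/L for large hydrophobic, A/V for small hydrophobic) and the function name are hypothetical simplifications, and real enzymes are also influenced by residues at other positions, which this sketch ignores.

```python
# Hypothetical, simplified P1 preferences for the three groups named in the text.
P1_PREFERENCES = {
    "trypsin-like": set("KR"),         # positively charged residues (assumed set)
    "chymotrypsin-like": set("FYWL"),  # large hydrophobic residues (assumed set)
    "elastase-like": set("AV"),        # small hydrophobic residues (assumed set)
}


def predicted_cleavage_sites(sequence: str, protease_class: str) -> list[int]:
    """Return 1-based positions of P1 residues after which cleavage is predicted.

    The C-terminal residue is skipped because no peptide bond follows it.
    """
    preferred = P1_PREFERENCES[protease_class]
    return [
        index + 1
        for index, residue in enumerate(sequence[:-1])
        if residue in preferred
    ]


if __name__ == "__main__":
    peptide = "GAKFVR"
    for name in P1_PREFERENCES:
        print(name, predicted_cleavage_sites(peptide, name))
```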

The recent studies based on the three-dimensional struc- 
tures of proteases and comparisons of amino acid sequences 
near the primary substrate-binding site in trypsin-like pro- 
teases of viral and bacterial origin suggest a putative general 
substrate binding scheme for proteases with specificity towards 
glutamic acid involving a histidine residue and a hydroxyl func- 
tion. However, a few other serine proteases such as peptidase 
A from E. coli and the repressor LexA show a distinctly different
mechanism of action without the classic Ser-His-Asp triad (12).
Some of the glycine residues are conserved in the vicinity of the 
catalytic serine residue, but their exact positions are variable 
(25). 

The chymotrypsin-like enzymes are confined almost entirely 
to animals, the exceptions being trypsin-like enzymes from 
actinomycetes and Saccharopolyspora spp. and from the fungus 
Fusarium oxysporum. 

A few of the serine proteases belonging to the subtilisin 
family show a catalytic triad composed of the same residues as 
in the chymotrypsin family; however, the residues occur in a 
different order (Asp-His-Ser). Some members of the subtilisin 
family from the yeasts Tritirachium and Metarhizium spp. require
thiol for their activity. The thiol dependence is attributable
to Cys173 near the active-site histidine (122).

The carboxypeptidases are unusual among the serine-depen- 
dent enzymes in that they are maximally active at acidic pH. 
These enzymes are known to possess a Glu residue preceding 
the catalytic Ser, which is believed to be responsible for their 
acidic pH optimum. Although the majority of the serine pro- 
teases contain the catalytic triad Ser-His-Asp, a few use the 
Ser-base catalytic dyad. The Glu-specific proteases display a 
pronounced preference for Glu-Xaa bonds over Asp-Xaa 
bonds (8). 

Aspartic Proteases 

Aspartic endopeptidases depend on the aspartic acid resi- 
dues for their catalytic activity. A general base catalytic mech- 
anism has been proposed for the hydrolysis of proteins by 
aspartic proteases such as penicillopepsin (121) and endothia- 
pepsin (215). Crystallographic studies have shown that the 
enzymes of the pepsin family are bilobed molecules with the 
active-site cleft located between the lobes and each lobe con- 
tributing one of the pair of aspartic acid residues that is essen- 
tial for the catalytic activity (20, 259). The lobes are homolo- 
gous to one another, having arisen by gene duplication. The 
retropepsin molecule has only one lobe, which carries only one 
aspartic residue, and the activity requires the formation of a 
noncovalent homodimer (184). In most of the enzymes from
the pepsin family, the catalytic Asp residues are contained in
an Asp-Thr-Gly-Xaa motif in both the N- and C-terminal lobes
of the enzyme, where Xaa is Ser or Thr, whose side chains can
hydrogen bond to Asp. However, Xaa is Ala in most of the
retropepsins. A marked conservation of cysteine residues is also
evident in aspartic proteases. The pepsins and the majority of
other members of the family show specificity for the cleavage
of bonds in peptides of at least six residues with hydrophobic
amino acids in both the P1 and P1' positions (132).

FIG. 3. Mechanism of action of proteases. (A) Aspartic proteases. (B) Cysteine
proteases. Im and +HIm refer to the imidazole and protonated imidazole,
respectively.

The specificity of the catalysis has been explained on the 
basis of available crystal structures (166). The structural and 
kinetic studies also have suggested that the mechanism involves
general acid-base catalysis with a lytic water molecule that
directly participates in the reaction (Fig. 3A). This is supported 
by the crystal structures of various aspartic protease-inhibitor 
complexes and by the thiol inhibitors mimicking a tetrahedral 
intermediate formed after the attack by the lytic water mole- 
cule (120). 



Metalloproteases 

The mechanism of action of metalloproteases is slightly dif- 
ferent from that of the above-described proteases. These en- 
zymes depend on the presence of bound divalent cations and 
can be inactivated by dialysis or by the addition of chelating 
agents. For thermolysin, based on the X-ray studies of the 
complex with a hydroxamic acid inhibitor, it has been proposed 
that Glu143 assists the nucleophilic attack of a water molecule
on the carbonyl carbon of the scissile peptide bond, which is
polarized by the Zn2+ ion (98). Most of the metalloproteases
are enzymes containing the His-Glu-Xaa-Xaa-His (HEXXH)
motif, which has been shown by X-ray crystallography to form
a part of the site for binding of the metal, usually zinc. 
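
As a rough illustration of how the HEXXH motif might be located in a protein sequence, the following sketch scans a one-letter sequence with a regular expression. The function name and the example fragment are assumptions made for this illustration; a motif hit alone does not establish that a protein is a metalloprotease.

```python
import re

# His-Glu-Xaa-Xaa-His, where Xaa is any residue. The lookahead allows
# overlapping occurrences to be reported.
HEXXH_PATTERN = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for each HEXXH occurrence."""
    return [
        (match.start() + 1, match.group(1))
        for match in HEXXH_PATTERN.finditer(sequence)
    ]


if __name__ == "__main__":
    # Hypothetical fragment used only to exercise the function.
    print(find_hexxh("AAVVAHELTHAVTDYTAG"))
```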

Cysteine Proteases 

Cysteine proteases catalyze the hydrolysis of carboxylic acid 
derivatives through a double-displacement pathway involving 
general acid-base formation and hydrolysis of an acyl-thiol 
intermediate. The mechanism of action of cysteine proteases is 
thus very similar to that of serine proteases. 

A striking similarity is also observed in the reaction mecha- 
nism for several peptidases of different evolutionary origins. 
The plant peptidase papain can be considered the archetype of 
cysteine peptidases and constitutes a good model for this family
of enzymes. They catalyze the hydrolysis of peptide, amide,
ester, thiol ester, and thiono ester bonds (226). The initial step
in the catalytic process (Fig. 3B) involves the noncovalent
binding of the free enzyme (structure a) and the substrate to
form the complex (structure b). This is followed by the acyla- 
tion of the enzyme (structure c), with the formation and re- 
lease of the first product, the amine R'-NH2. In the next 
deacylation step, the acyl-enzyme reacts with a water molecule 
to release the second product, with the regeneration of free 
enzyme. 

The enzyme papain consists of a single protein chain folded 
to form two domains containing a cleft for the substrate to 
bind. The crystal structure of papain confirmed the Cys25-His159
pairing (11). The presence of a conserved asparagine
residue (Asn175) in the proximity of catalytic histidine
(His159) creating a Cys-His-Asn triad in cysteine peptidases is
considered analogous to the Ser-His-Asp arrangement found 
in serine proteases. 

Studies of the mechanism of action of proteases have re- 
vealed that they exhibit different types of mechanism based on 
their active-site configuration. The serine proteases contain a 
Ser-His-Asp catalytic triad, and the hydrolysis of the peptide 
bond involves an acylation step followed by a deacylation step.
Aspartic proteases are characterized by an Asp-Thr-Gly motif 
in their active site and by an acid-base catalysis as their mech- 
anisms of action. The activity of metalloproteases depends on 
the binding of a divalent metal ion to a His-Glu-Xaa-Xaa-His 
motif. Cysteine proteases adopt a hydrolysis mechanism involv- 
ing a general acid-base formation followed by hydrolysis of an 
acyl-thiol intermediate. 

PHYSIOLOGICAL FUNCTIONS OF PROTEASES 

Proteases execute a large variety of complex physiological 
functions. Their importance in conducting the essential meta- 
bolic and regulatory functions is evident from their occurrence 
in all forms of living organisms. Proteases play a critical role in 
many physiological and pathological processes such as protein 
catabolism, blood coagulation, cell growth and migration, tis- 
sue arrangement, morphogenesis in development, inflamma- 
tion, tumor growth and metastasis, activation of zymogens, 
release of hormones and pharmacologically active peptides 
from precursor proteins, and transport of secretory proteins 
across membranes. In general, extracellular proteases catalyze 
the hydrolysis of large proteins to smaller molecules for sub- 
sequent absorption by the cell whereas intracellular proteases 
play a critical role in the regulation of metabolism. In contrast 
to the multitude of the roles contemplated for proteases, our 
knowledge about the mechanisms by which they perform these 
functions is very limited. Extensive research is being carried 
out to unravel the metabolic pathways in which proteases play 
an integral role; this research will continue to contribute sig- 
nificantly to our present state of information. Some of the 
major activities in which the proteases participate are de- 
scribed below. 

Protein Turnover 

All living cells maintain a particular rate of protein turnover 
by continuous, albeit balanced, degradation and synthesis of 
proteins. Catabolism of proteins provides a ready pool of 
amino acids as precursors of the synthesis of proteins. Intra- 
cellular proteases are known to participate in executing the 
proper protein turnover for the cell. In E. coli, ATP-dependent 
protease La, the lon gene product, is responsible for hydrolysis
of abnormal proteins (38). The turnover of intracellular proteins
in eukaryotes is also affected by a pathway involving
ATP-dependent proteases (91). Evidence for the participation
of proteolytic activity in controlling the protein turnover was
demonstrated by the lack of proper turnover in protease-deficient
mutants.

Sporulation and Conidial Discharge 

The formation of spores in bacteria (142), ascospores in 
yeasts (58), fruiting bodies in slime molds (205) and conidial 
discharge in fungi (221) all involve intensive protein turnover. 
The requirement of a protease for sporulation has been dem- 
onstrated by the use of protease inhibitors (41). Ascospore 
formation in yeast diploids was shown to be related to the 
increase in protease A activity (58). Extensive protein degra- 
dation accompanied the formation of a fruiting body and its 
differentiation to a stalk in slime molds. The alkaline serine 
protease of Conidiobolus coronatus was shown to be involved in
forcible conidial discharge by isolation of a mutant with less 
conidial formation (221). Formation of the less active protease 
by autoproteolysis represents a novel means of physiological 
regulation of protease activity in C. coronatus (219). 

Germination 

The dormant spores lack the amino acids required for ger- 
mination. Degradation of proteins in dormant spores by serine 
endoproteinases makes amino acids and nitrogen available for 
the biosynthesis of new proteins and nucleotides. These pro- 
teases are specific only for storage proteins and do not affect 
other spore proteins. Their activity is rapidly lost on germination
of the spores (227). Microconidial germination and hyphal
fusion also involve the participation of a specific alkaline serine
protease (159). Extracellular acid proteases are believed to be
involved in the breakage of cell wall polypeptide linkages during
germination of Dictyostelium discoideum spores (118) and
Polysphondylium pallidum microcysts (206).

Enzyme Modification 

Activation of the zymogenic precursor forms of enzymes and 
proteins by specific proteases represents an important step in 
the physiological regulation of many rate-controlling processes 
such as generation of protein hormones, assembly of fibrils and 
viruses, blood coagulation, and fertilization of ova by sperm. 
Activation of zymogenic forms of chitin synthase by limited 
proteolysis has been observed in Candida albicans, Mucor 
rouxii, and Aspergillus nidulans. Kex-2 protease (kexin; EC 
3.4.21.61), originally discovered in yeast, has emerged as a 
prototype of a family of eukaryotic precursor processing en- 
zymes. It catalyzes the hydrolysis of prohormones and of inte- 
gral membrane proteins of the secretory pathway by specific 
cleavage at the carboxyl side of pairs of basic residues such as 
Lys-Arg or Arg-Arg (12). Furin (EC 3.4.21.75) is a mammalian
homolog of the Kex-2 protease that was discovered serendipitously
and has been shown to catalyze the hydrolysis of a wide
variety of precursor proteins at Arg-X-Lys or Arg-Arg sites 
within the constitutive secretory pathway (266). Pepsin, tryp- 
sin, and chymotrypsin occur as their inactive zymogenic forms, 
which are activated by the action of proteases. 
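
The dibasic-site processing described for the Kex-2 protease can be sketched as a simple sequence scan. The sketch below assumes only what the text states, namely cleavage at the carboxyl side of Lys-Arg or Arg-Arg pairs; the function names and the example precursor are hypothetical, and no structural or contextual requirements of the real enzyme are modeled.

```python
import re

# Lys-Arg (KR) or Arg-Arg (RR) pairs; the lookahead reports overlapping pairs.
DIBASIC_PATTERN = re.compile(r"(?=(KR|RR))")


def dibasic_cleavage_sites(sequence: str) -> list[int]:
    """Return 1-based positions of the second basic residue of each pair.

    Cleavage is taken to occur on the carboxyl side of that residue.
    """
    return [match.start() + 2 for match in DIBASIC_PATTERN.finditer(sequence)]


def split_at_sites(sequence: str, sites: list[int]) -> list[str]:
    """Split a sequence after each listed 1-based position, dropping empty pieces."""
    fragments = []
    previous = 0
    for site in sites:
        fragments.append(sequence[previous:site])
        previous = site
    fragments.append(sequence[previous:])
    return [fragment for fragment in fragments if fragment]


if __name__ == "__main__":
    precursor = "AAKRGGRRLL"  # hypothetical precursor sequence
    sites = dibasic_cleavage_sites(precursor)
    print(sites, split_at_sites(precursor, sites))
```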

Proteolytic inactivation of enzymes, leading to irreversible 
loss of in vivo catalytic activity, is also a physiologically signif- 
icant event. Several enzymes are known to be inactivated in 
response to physiological or developmental changes or after a 
metabolic shift. Proteinases A and B from yeast inactivate 
several enzymes in a two-step process involving covalent mod- 
ification of proteins as a marking mechanism for proteolysis. 

Proteolytic modification of enzymes is known to result in a 
protein with altered physiological function; e.g., leucyl-tRNA
synthetase from E. coli is converted into an enzyme which
catalyzes leucine-dependent pyrophosphate exchange by removal
of a small peptide from the native enzyme.

Nutrition 

Proteases assist the hydrolysis of large polypeptides into 
smaller peptides and amino acids, thus facilitating their ab- 
sorption by the cell. The extracellular enzymes play a major 
role in nutrition due to their depolymerizing activity. The mi- 
crobial enzymes and the mammalian extracellular enzymes 
such as those secreted by pancreas are primarily involved in 
keeping the cells alive by providing them with the necessary 
amino acid pool as nutrition. 

Regulation of Gene Expression 

Modulation of gene expression mediated by protease has 
been demonstrated (241). Proteolysis of a repressor by an 
ATP-requiring protease resulted in a derepression of the gene. 
A change in the transcriptional specificity of the B subunit of 
Bacillus thuringiensis RNA polymerase was correlated with its
proteolytic modification (154). Modification of ribosomal pro- 
teins by proteases has been suggested to be responsible for the 
regulation of translation (128). 

Besides the general functions described so far, the proteases 
also mediate the degradation of a variety of regulatory proteins 
that control the heat shock response, the SOS response to 
DNA damage, the life cycle of bacteriophage (75), and pro- 
grammed bacterial cell death (303). Recently, a new physio- 
logical function has been attributed to the ATP-dependent 
proteases conserved between bacteria and eukaryotes. It is 
believed that they act as chaperones and mediate not only 
proteolysis but also the insertion of proteins into membranes 
and the disassembly or oligomerization of protein complexes 
(275). In addition to the multitude of activities that are already 
assigned to proteases, many more new functions are likely to 
emerge in the near future. 

APPLICATIONS OF PROTEASES 

Proteases have a large variety of applications, mainly in the 
detergent and food industries. In view of the recent trend of 
developing environmentally friendly technologies, proteases 
are envisaged to have extensive applications in leather treat- 
ment and in several bioremediation processes. The worldwide 
requirement for enzymes for individual applications varies 
considerably. Proteases are used extensively in the pharmaceu- 
tical industry for preparation of medicines such as ointments 
for debridement of wounds, etc. Proteases that are used in the 
food and detergent industries are prepared in bulk quantities 
and used as crude preparations, whereas those that are used in 
medicine are produced in small amounts but require extensive 
purification before they can be used. 

Detergents 

Proteases are one of the standard ingredients of all kinds of 
detergents ranging from those used for household laundering 
to reagents used for cleaning contact lenses or dentures. The 
use of proteases in laundry detergents accounts for approxi- 
mately 25% of the total worldwide sales of enzymes. The prep- 
aration of the first enzymatic detergent, "Burnus," dates back 
to 1913; it consisted of sodium carbonate and a crude pancre- 
atic extract. The first detergent containing the bacterial en- 
zyme was introduced in 1956 under the trade name BIO-40. In 
1960, Novo Industry A/S introduced alcalase, produced by 
Bacillus licheniformis; its commercial name was BIOTEX. This
was followed by Maxatase, a detergent made by Gist-Brocades.
The biggest market for detergents is in the laundry industry, 
amounting to a worldwide production of 13 billion tons per 
year. The ideal detergent protease should possess broad sub- 
strate specificity to facilitate the removal of a large variety of 
stains due to food, blood, and other body secretions. Activity 
and stability at high pH and temperature and compatibility 
with other chelating and oxidizing agents added to the detergents
are among the major prerequisites for the use of proteases
in detergents. The key parameter for the best performance
of a protease in a detergent is its pI. It is known that a
protease is most suitable for this application if its pI coincides
with the pH of the detergent solution. Esperase and Savinase
T (Novo Industry), produced by alkalophilic Bacillus spp., are
two commercial preparations with very high isoelectric points
(pI 11); hence, they can withstand higher pH ranges. Due to
the present energy crisis and the awareness for energy conser- 
vation, it is desirable to use proteases that are active at lower 
temperatures. A combination of lipase, amylase, and cellulase 
is expected to enhance the performance of protease in laundry 
detergents. 
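
As a rough illustration of the pI criterion described above, the following sketch estimates the isoelectric point of a protein from its amino acid sequence and checks whether it lies close to the pH of a detergent solution. It is a simplified model under stated assumptions: the pKa values are approximate textbook-style figures, structural effects on ionization are ignored, and the function names, the tolerance, and the toy sequences are hypothetical and are not taken from any of the enzymes named in the text.

```python
# Approximate pKa values (assumed, textbook-style figures; real proteins deviate).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(sequence, ph):
    """Net charge of a polypeptide at a given pH (Henderson-Hasselbalch model)."""
    seq = sequence.upper()
    charge = 0.0
    # Basic groups carry +1 when protonated.
    for group, pka in PKA_POSITIVE.items():
        count = 1 if group == "N_TERM" else seq.count(group)
        charge += count / (1.0 + 10 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated.
    for group, pka in PKA_NEGATIVE.items():
        count = 1 if group == "C_TERM" else seq.count(group)
        charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    The net charge decreases monotonically with pH, is positive at pH 0,
    and is negative at pH 14, so the root is bracketed by [low, high].
    """
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def matches_detergent_ph(sequence, detergent_ph, tolerance=0.5):
    """True if the estimated pI lies within `tolerance` pH units of the wash pH."""
    return abs(isoelectric_point(sequence) - detergent_ph) <= tolerance


if __name__ == "__main__":
    toy_basic = "KKKK"   # hypothetical lysine-rich peptide
    toy_acidic = "DDDD"  # hypothetical aspartate-rich peptide
    for name, seq in (("basic", toy_basic), ("acidic", toy_acidic)):
        print(name, round(isoelectric_point(seq), 2))
```

Under this model a lysine-rich toy peptide has a strongly alkaline estimated pI and an aspartate-rich one a strongly acidic pI, which mirrors the qualitative argument that highly basic proteases suit high-pH detergent solutions.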

All detergent proteases currently used in the market are 
serine proteases produced by Bacillus strains. Fungal alkaline 
proteases are advantageous due to the ease of downstream 
processing to prepare a microbe-free enzyme. An alkaline protease
from Conidiobolus coronatus was found to be compatible
with commercial detergents used in India (219) and retained
43% of its activity at 50°C for 50 min in the presence of Ca2+
(25 mM) and glycine (1 M) (16).

Leather Industry 

Leather processing involves several steps such as soaking, 
dehairing, bating, and tanning. The major building blocks of 
skin and hair are proteinaceous. The conventional methods of 
leather processing involve hazardous chemicals such as sodium 
sulfide, which create problems of pollution and effluent dis- 
posal. The use of enzymes as alternatives to chemicals has 
proved successful in improving leather quality and in reducing 
environmental pollution. Proteases are used for selective hy- 
drolysis of noncollagenous constituents of the skin and for 
removal of nonfibrillar proteins such as albumins and globu- 
lins. The purpose of soaking is to swell the hide. Traditionally, 
this step was performed with alkali. Currently, microbial alka- 
line proteases are used to ensure faster absorption of water 
and to reduce the time required for soaking. The use of non- 
ionic and, to some extent, anionic surfactants is compatible 
with the use of enzymes. The conventional method of dehair- 
ing and dewooling consists of development of an extremely 
alkaline condition followed by treatment with sulfide to solu- 
bilize the proteins of the hair root. At present, alkaline pro- 
teases with hydrated lime and sodium chloride are used for 
dehairing, resulting in a significant reduction in the amount of 
wastewater generated. Earlier methods of bating were based 
on the use of animal feces as the source of proteases; these 
methods were unpleasant and unreliable and were replaced by 
methods involving pancreatic trypsin. Currently, trypsin is used 
in combination with other Bacillus and Aspergillus proteases for
bating. The selection of the enzyme depends on its specificity 
for matrix proteins such as elastin and keratin, and the amount 
of enzyme needed depends on the type of leather (soft or hard) 
to be produced. Increased usage of enzymes for dehairing and 
bating not only prevents pollution problems but also is effective 
in saving energy. Novo Nordisk manufactures three different 
proteases, Aquaderm, NUE, and Pyrase, for use in soaking, 
dehairing, and bating, respectively.

Food Industry

The use of proteases in the food industry dates back to 
antiquity. They have been routinely used for various purposes 
such as cheesemaking, baking, preparation of soya hydroly- 
sates, and meat tenderization. 

Dairy industry. The major application of proteases in the 
dairy industry is in the manufacture of cheese. The milk-coag- 
ulating enzymes fall into three main categories, (i) animal 
rennets, (ii) microbial milk coagulants, and (iii) genetically 
engineered chymosin. Both animal and microbial milk-coagulating
proteases belong to a class of acid aspartate proteases
and have molecular weights between 30,000 and 40,000. Rennet
extracted from the fourth stomach of unweaned calves contains
the highest ratio of chymosin (EC 3.4.23.4) to pepsin
activity. A world shortage of calf rennet due to the increased 
demand for cheese production has intensified the search for 
alternative microbial milk coagulants. The microbial enzymes 
exhibited two major drawbacks, i.e., (i) the presence of high 
levels of nonspecific and heat-stable proteases, which led to the 
development of bitterness in cheese after storage; and (ii) a 
poor yield. Extensive research in this area has resulted in the 
production of enzymes that are completely inactivated at nor- 
mal pasteurization temperatures and contain very low levels of 
nonspecific proteases. In cheesemaking, the primary function
of proteases is to hydrolyze the specific peptide bond (the
Phe105-Met106 bond) to generate para-κ-casein and macropeptides.
Chymosin is preferred due to its high specificity for
casein, which is responsible for its excellent performance in
cheesemaking. The proteases produced by GRAS (generally
regarded as safe)-cleared microbes such as Mucor miehei, Bacillus
subtilis, and Endothia parasitica are gradually replacing
chymosin in cheesemaking. In 1988, chymosin produced
through recombinant DNA technology was first introduced to
cheesemakers for evaluation. Genencor International increased
the production of chymosin in Aspergillus niger var.
awamori to commercial levels. At present, three recombinant
chymosin products are available and are awaiting legislative
approval for their use in cheesemaking (72).

Whey is a by-product of cheese manufacture. It contains 
lactose, proteins, minerals, and lactic acid. The insoluble heat- 
denatured whey protein is solubilized by treatment with im- 
mobilized trypsin. 

Baking industry. Wheat flour is a major component of bak- 
ing processes. It contains an insoluble protein called gluten, 
which determines the properties of the bakery doughs. Endo- 
and exoproteinases from Aspergillus oryzae have been used to 
modify wheat gluten by limited proteolysis. Enzymatic treat- 
ment of the dough facilitates its handling and machining and 
permits the production of a wider range of products. The 
addition of proteases reduces the mixing time and results in 
increased loaf volumes. Bacterial proteases are used to im- 
prove the extensibility and strength of the dough. 

Manufacture of soy products. Soybeans serve as a rich 
source of food, due to their high content of good-quality pro- 
tein. Proteases have been used from ancient times to prepare 
soy sauce and other soy products. The alkaline and neutral 
proteases of fungal origin play an important role in the pro- 
cessing of soy sauce. Proteolytic modification of soy proteins 
helps to improve their functional properties. Treatment of soy 
proteins with alcalase at pH 8 results in hydrolysates
with high solubility, good protein yield, and low bitterness. The 
hydrolysate is used in protein-fortified soft drinks and in the 
formulation of dietetic feeds. 

Debittering of protein hydrolysates. Protein hydrolysates 
have several applications, e.g., as constituents of dietetic and
health products, in infant formulae and clinical nutrition supplements,
and as flavoring agents. The bitter taste of protein
hydrolysates is a major barrier to their use in food and health 
care products. The intensity of the bitterness is proportional to 
the number of hydrophobic amino acids in the hydrolysate. 
The presence of a proline residue in the center of the peptide 
also contributes to the bitterness. The peptidases that can 
cleave hydrophobic amino acids and proline are valuable in 
debittering protein hydrolysates. Aminopeptidases from lactic 
acid bacteria are available under the trade name Debitrase. 
Carboxypeptidase A has a high specificity for hydrophobic 
amino acids and hence has a great potential for debittering. A 
careful combination of an endoprotease for the primary hy- 
drolysis and an aminopeptidase for the secondary hydrolysis is 
required for the production of a functional hydrolysate with 
reduced bitterness. 

Synthesis of aspartame. The use of aspartame as a noncalo- 
rific artificial sweetener has been approved by the Food and 
Drug Administration. Aspartame is a dipeptide composed of 
L-aspartic acid and the methyl ester of L-phenylalanine. The L 
configuration of the two amino acids is responsible for the 
sweet taste of aspartame. Maintenance of the stereospecificity 
is crucial, but it adds to the cost of production by chemical 
methods. Enzymatic synthesis of aspartame is therefore pre- 
ferred. Although proteases are generally regarded as hydro- 
lytic enzymes, they catalyze the reverse reaction under certain 
kinetically controlled conditions. An immobilized preparation
of thermolysin from Bacillus thermoproteolyticus is used for the
enzymatic synthesis of aspartame. Toyo Soda (Japan) and
DSM (The Netherlands) are the major industrial producers of
aspartame.

Pharmaceutical Industry 

The wide diversity and specificity of proteases are used to 
great advantage in developing effective therapeutic agents. 
Oral administration of proteases from Aspergillus oryzae (Lu- 
izym and Nortase) has been used as a digestive aid to correct 
certain lytic enzyme deficiency syndromes. Clostridial collage- 
nase or subtilisin is used in combination with broad-spectrum 
antibiotics in the treatment of burns and wounds. An asparaginase
isolated from E. coli is used to eliminate asparagine from
the bloodstream in the various forms of lymphocytic leukemia.
Alkaline protease from Conidiobolus coronatus was found to
be able to replace trypsin in animal cell cultures (36). 

Other Applications 

Besides their industrial and medicinal applications, pro- 
teases play an important role in basic research. Their selective 
peptide bond cleavage is used in the elucidation of structure- 
function relationship, in the synthesis of peptides, and in the 
sequencing of proteins. 

In essence, the wide specificity of the hydrolytic action of 
proteases finds an extensive application in the food, detergent, 
leather, and pharmaceutical industries, as well as in the struc- 
tural elucidation of proteins, whereas their synthetic capacities 
are used for the synthesis of proteins. 

GENETIC ENGINEERING OF MICROBIAL PROTEASES 

Gene cloning is a rapidly progressing technology that has 
been instrumental in improving our understanding of the struc- 
ture-function relationship of genetic systems. It provides an 
excellent method for the manipulation and control of genes. 
More than 50% of the industrially important enzymes are now 
produced from genetically engineered microorganisms (96).

TABLE 4. Cloning, sequencing, and/or expression of protease genes or cDNAs from microbial sources

Source of protease gene ..... Reference(s)

Bacteria
  Bacilli
    B. subtilis 168
      apr ..... 270
      npr ..... 90, 295, 297, 323
      epr ..... 27, 263
      bpr ..... 265
      mpr ..... 264
      isp-1 ..... 138
    B. subtilis (Natto) 16 ..... 319
    B. subtilis N 515-N (nprX) ..... 157
    Alkalophilic Bacillus strain ..... 129
    B. alcalophilus PB92 ..... 300
    Bacillus sp. strain Y ..... 293
    Alkalophilic Bacillus sp. NKS-21 ..... 318
    Alkalophilic Bacillus sp. LG-12 ..... 251
    Bacillus sp. EA (Npr) ..... 249
  Lactococci
    Streptococcus cremoris Wg2 ..... 139-141
    Lactococcus lactis subsp. cremoris H2 ..... 317
    Streptococcus lactis NCDO 763 ..... 137
    L. lactis subsp. cremoris SK11 ..... 50
    L. lactis subsp. lactis VC 317 ..... 153
    L. lactis subsp. cremoris Wg2 ..... 58
    Lactobacillus delbrueckii subsp. bulgaricus ..... 69
  Streptomyces
    S. griseus ..... 89
    S. griseus ATCC 10137 ..... 107
    S. cacaoi YM15 ..... 32
    S. fradiae ATCC 14544 ..... 135
    S. lividans 66 ..... 17, 18, 163
  Serratia
    Serratia sp. strain E-15 ..... 198
    S. marcescens SM6 ..... 24
    S. marcescens ..... 187
    S. marcescens ATCC 27117 ..... 134
  Pseudomonas
    P. aeruginosa IFO 3455 ..... 7, 254
    P. aeruginosa PAO1 ..... 83
    P. aeruginosa ..... 82
    P. nalgiovense ..... 68
  Aeromonas
    A. hydrophila SO 212 ..... 238, 239
    A. hydrophila D13 ..... 238
  Vibrio
    V. anguillarum NB10 ..... 185
    V. parahaemolyticus ..... 155
    V. vulnificus ..... 34
    V. proteolyticus ..... 43
    V. alginolyticus ..... 45
    V. cholerae ..... 86
  E. coli
    Membrane proteases
      lspA, lep ..... 42, 333
      sppA ..... 108, 276
      ompT ..... 80
    ATP-dependent proteases
      La/Lon ..... 3, 35, 334
      Clp/Ti ..... 181
  Miscellaneous
    Lysobacter enzymogenes 495 ..... 260
    L. enzymogenes ..... 57
    Achromobacter lyticus M 497-1 ..... 208
    A. lyticus ..... 169
    Erwinia sp. ..... 2, 307
    Rhodocyclus gelatinosus APR 3-2 ..... 116
    Bacteroides nodosus ..... 194
    Xanthomonas campestris pv. campestris ..... 168
    Treponema denticola ATCC 33520 ..... 176
    Staphylococcus aureus V8 ..... 29
    Thermus aquaticus YT-1 ..... 148
    Thermomonospora fusca YX ..... 152
    Alteromonas sp. strain O-7 ..... 298, 299
  IgA family of proteases
    N. gonorrhoeae ..... 62, 224, 232
    N. meningitidis ..... 169
    H. influenzae ..... 228
    Streptococcus sanguis ATCC 10556 ..... 70
    S. pneumoniae ..... 229, 308
Fungi
  Filamentous fungi
    Acidic proteases
      Mucor pusillus rennin (MPR) ..... 94, 296
      Mucor miehei aspartyl protease (MMAP) ..... 51, 79
      R. niveus aspartic protease (RNAP) ..... 100, 101
      A. awamori aspergillopepsin A ..... 15
      A. oryzae aspergillopepsin A ..... 74
      A. fumigatus aspergillopepsin F ..... 156, 237
      A. oryzae M-9 ..... 284
      A. saitoi ATCC 14332 ..... 257
      A. niger var. macrosporus
        Proctase A ..... 114, 125, 283
        Proctase B ..... 113, 175
    Alkaline proteases (Alp)
      Aspergillus
        A. oryzae ATCC 20386 ..... 195, 207, 286, 288
        A. oryzae Thailand industrial strain ..... 33
        A. soya ..... 211
        A. fumigatus ..... 123, 237
        A. flavus ..... 233
        A. nidulans ..... 131
      Acremonium
        A. chrysogenum ATCC 11550 ..... 115
      Fusarium ..... 136, 193
    Serine proteases
      Tritirachium album Limber
        Proteinase K ..... 81
        Proteinase T ..... 247
    Metalloproteases
      A. fumigatus MEP ..... 124, 262
      A. flavus MEP-20 ..... 234
      A. fumigatus MEP-20 ..... 234
  Yeasts
    Acidic proteases
      S. fibuligera (PEP1) ..... 95, 320
      S. cerevisiae (PEP4) ..... 4, 315
      S. cerevisiae (BAR1) ..... 177
      S. cerevisiae (YAP3) ..... 53
      C. albicans (SAP) ..... 106, 170, 196
      C. tropicalis (ACP) ..... 294
      Y. lipolytica 148 (AXP) ..... 331
      Wild-type yeast ..... 332
    Alkaline proteases
      Y. lipolytica (AEP) XPR2 ..... 44, 201
    Serine proteases
      Kluyveromyces lactis KEX-1 ..... 285
      S. cerevisiae KEX-2 ..... 188, 212
    Other proteases
      Yeast carboxypeptidase (CPY)
        S. cerevisiae PRC1 ..... 202
      Vacuolar protease B
        S. cerevisiae PRB1 ..... 190
      Yeast proteasome PRG1 ..... 65
Viruses
  Animal viruses
    Herpesviruses
      HSV-1 ..... 47
      HSV-2 ..... 271
      MCMV ..... 172
      HHV-6 ..... 292
    Adenoviruses
      Ad4 ..... 102
      Ad12 ..... 103
      Ad3 ..... 104
      Ad40 ..... 306
      Ad41 ..... 306
    Retroviruses
      RSV ..... 252
      ASLV ..... 144
      ARV-2 ..... 248
      M-MuLV ..... 256
      SRV-1 ..... 231
      HTLV-2 ..... 255
      BLV ..... 245
      M-PMV ..... 105, 268
      SIVmac ..... 40
      ARV ..... 144
      HTLV-1 ..... 246
      HIV-1 ..... 46, 78, 171, 222
    Picornaviruses
      Human rhinovirus type 14 ..... 162
      Foot-and-mouth disease virus ..... 6
      Encephalomyocarditis virus ..... 6
      Poliovirus ..... 6
  Plant viruses
    Bean yellow mosaic virus ..... 22
    Zucchini yellow mosaic virus (Singapore isolate) ..... 316



Several reports have been published in the past decade (Table 
4) on the isolation and manipulation of microbial protease 
genes with the aim of (i) enzyme overproduction by the gene 
dosage effect, (ii) studying the primary structure of the protein 
and its role in the pathogenicity of the secreting microorgan- 
ism, and (iii) protein engineering to locate the active-site res- 
idues and/or to alter the enzyme properties to suit its commer- 
cial applications. Protease genes from bacteria, fungi, and 
viruses have been cloned and sequenced (Table 4). 

Bacteria 

The objective of cloning bacterial protease genes has been 
mainly the overproduction of enzymes for various commercial 
applications in the food, detergent, and pharmaceutical industries.
The virulence of several bacteria is related to the secretion
of several extracellular proteases. Gene cloning in these
microbes was studied to understand the basis of their pathogenicity
and to develop therapeutics against them. Proteases
play an important role in cell physiology, and protease gene
cloning, especially in E. coli, has been attempted to study the
regulatory aspects of proteases.

Bacilli. (i) B. subtilis as a host for cloning of protease genes
from Bacillus spp. The ability of B. subtilis to secrete various
proteins into the culture medium and its lack of pathogenicity
make it a potential host for the production of foreign polypeptides
by recombinant DNA technology. Several Bacillus spp.
secrete two major types of protease, a subtilisin or alkaline
protease and a metalloprotease or neutral protease, which are
of industrial importance. Studies of these extracellular proteases
are significant not only from the point of view of overproduction
but also for understanding their mechanism of secretion.
Table 5 describes the cloning of genes for several
neutral (npr) and alkaline (apr) proteases from various bacilli
into B. subtilis.

(ii) B. subtilis. B. subtilis 168 secretes at least six extracellular
proteases into the culture medium at the end of the exponential
phase. The structural genes encoding the alkaline protease
(apr) or subtilisin (270), neutral protease A and B (nprA and
nprB) (90, 297, 323), minor extracellular protease (epr) (27,
263), bacillopeptidase F (bpr) (265), and metalloprotease
(mpr) (264) have been cloned and characterized. These proteases
are synthesized in the form of a "prepro" enzyme. To
increase the expression of subtilisin and neutral proteases,
Henner et al. replaced the natural promoters of apr and npr
genes with the amylase promoter from B. amyloliquefaciens
and the neutral protease promoter from B. subtilis, respectively
(90). To understand the regulation of nprA gene expression,
Toma et al. cloned the genes from B. subtilis 168 (normal
producer) and Base 1A341 (overproducer) (295). The two
genes were found to be highly homologous except for a stretch
of 66 bp close to the promoter region, which is absent in the
Base 1A341 gene. The epr gene shows partial homology to the
apr gene and to the major intracellular serine protease (isp-1)
gene of B. subtilis (138). The epr gene was mapped at a locus
different from the apr and npr loci on the B. subtilis chromosome
and was shown not to be required for growth or sporulation,
similar to apr or npr genes. Deletion of 240 amino acids
(aa) from the C-terminal region of the epr gene product did not
abolish the enzyme activity (27, 263). The deduced amino acid
sequence of the mature bpr gene product is similar to those of
other serine proteases of B. subtilis, i.e., subtilisin, ISP-1, and
Epr. B. subtilis strains containing mutations in five extracellular
protease genes (apr, npr, epr, mpr, and bpr) have been constructed
(264) with the aim of expressing heterologous gene
products in B. subtilis. The total amino acid sequence of B.
subtilis ISP-1 deduced from the nucleotide sequence showed
considerable homology (45%) to subtilisin. Highly conserved
sequences are present around the essential amino acids, Ser,
His, and Asp, indicating that the genes for both the intra- and
extracellular serine proteases have a common ancestor.
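
Percentage figures such as the 45% quoted above are typically obtained by counting identical residues in a pairwise alignment. The sketch below shows one common way to compute such a value, assuming the two sequences have already been aligned and padded with gap characters. The function name, the convention of ignoring gapped columns, and the toy sequences are illustrative assumptions; the text does not state which convention the cited studies used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped (one of several
    conventions in use; others divide by the full alignment length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped column: not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical toy fragments, not real protease sequences.
    print(percent_identity("GHSAD", "GHTAD"))
    print(percent_identity("GH-AD", "GHSAD"))
```

Strictly, a percentage computed this way measures sequence identity; homology, in the sense of common ancestry, is an inference drawn from it.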

In 1995, Yamagata et al. cloned and sequenced a 90-kDa 
serine protease gene (hspK) from B. subtilis (Natto) 16 (319).
The large size of the enzyme may represent an ancient form of 
bacterial serine protease. 

Analysis of DNA sequences of subtilisin BPN' from B. 
amyloliquefaciens (304, 313) and subtilisin Carlsberg from B. 
licheniformis (119) revealed that the two sequences are highly
conserved in the coding region for the mature protein and 
must therefore have a common ancestral precursor. Yoshi- 
moto et al. characterized the gene encoding subtilisin amy- 
losacchariticus from B. subtilis subsp. amylosacchariticus (327, 
328). The sequence was highly homologous to that of subtilisin 
E from B. subtilis 168 (269). The gene was expressed in B.
subtilis ISW 1214 by using the vector pHY300PLK, with 20-fold-higher
activity than that of the host and 4-fold-higher
activity than that of B. subtilis subsp. amylosacchariticus.

TABLE 5. Cloning of protease genes in B. subtilis

Source | Type of protease | Expression (fold) | Characterization of gene | Reference(s)
B. amyloliquefaciens F | Neutral | 50 | — (a) | 99, 178
B. stearothermophilus F TELNE | Neutral | 5 | nprS sequenced; homologous to nprM | 203
B. amyloliquefaciens IFO 14141 | Neutral | 15 | Partially sequenced | 133, 330
B. cereus | Neutral (metalloprotease) | + (b) | | 1
B. stearothermophilus CU21 | Thermostable, neutral | + | nprT sequenced | 67, 279
B. stearothermophilus 313-1 | Thermostable, neutral | 29 | | 324
B. stearothermophilus HY-69 | Thermostable, neutral | + | | [illegible]
B. stearothermophilus MK232 and YG185-hyperproducing mutant of MK232 | Highly thermostable, neutral | | nprM sequenced; deduced amino acid sequence homologous to thermolysin (B. thermoproteolyticus) except for two substitutions, Asp37 to Asn37 and Glu119 to Gln119 | [illegible]
B. stearothermophilus | Thermostable, metalloprotease | + | | 258
B. brevis | Metalloprotease | | + | 9
B. licheniformis | Alkaline and neutral | | | 209
B. amyloliquefaciens | Alkaline and neutral | | apr, npr sequenced | 304
B. licheniformis | Alkaline | + | | 88
B. pumilus IFO 12092 | Alkaline | + | | 289
B. amyloliquefaciens | Subtilisin | + | + | 313
B. natto | | 350 | | 197
B. licheniformis ATCC 14580 | C-terminal glutamic acid specific (BLase) | + | | 127

(a) —, no data is available.
(b) +, expression of the gene was observed.

(iii) Alkalophilic Bacillus spp. Bacillus proteases with an
extremely alkaline pH optimum are generally used in detergent
powders and are preferred over the subtilisins (optimal
pH, 8.5 to 10.0). The information on these enzymes is helpful
in designing new subtilisins. Kaneko et al. cloned and se- 
quenced the ale gene, encoding alkaline elastase YaB, a new 
subtilisin from an alkalophilic Bacillus strain (129). The de- 
duced amino acid sequence showed 55% homology to subtili- 
sin BPN'. Almost all the positively charged residues have been 
predicted to be present on the surface of the alkaline elastase 
YaB molecule, facilitating its binding to elastin. The deduced 
amino acid sequence of the highly alkaline serine protease 
from another alkalophilic strain, B. alcalophilus PB92, showed 
considerable homology to YaB (300). The cloned gene was 
further used to increase the production level of the protease by 
gene amplification through chromosomal integration. Increased
enzyme production and gene stabilization were observed
when nontandem duplication occurred.

A gene encoding ISP-1 was characterized from alkalophilic
Bacillus sp. strain NKS-21 (318). The nucleotide sequence was
50% homologous to genes encoding ISP-1 from B. subtilis, B.
polymyxa, and the alkalophilic Bacillus sp. strain 221.

(iv) Other bacilli. A gene encoding the highly thermostable 
neutral proteinase (Npr) from Bacillus sp. strain EA1 was
shown to be closely related to an npr gene from B. caldolyticus
YP-T, except for a single-amino-acid change in the gene product
(249). The enzyme from Bacillus sp. strain EA1 was more
thermostable than the enzyme from B. caldolyticus YP-T; this 
can be attributed to the single-amino-acid change. 



Lactococci. Lactococci (Lactococcus lactis subsp. lactis and
cremoris, previously Streptococcus lactis and Streptococcus cremoris,
respectively), the dairy starter cultures, have a complex
proteolytic system which enables them to grow in milk by
degrading casein into small peptides and free amino acids. This
leads to the development of the texture and flavor of various
dairy products. The importance of the cell envelope-located
proteolytic system for dairy product quality has resulted in
increased fundamental research on the enzymes involved and
their genes. On the basis of differences in caseinolytic specificity,
the lactococcal proteases have been classified into two
main groups: the PI-type protease, which degrades predominantly
β-casein, and the PIII-type protease, which degrades
αs1-, β-, and κ-casein (305). Most of the genetic studies have
focused on the PI-type protease genes. Lactococcal protease
genes are located mostly on plasmids, which differ considerably
in size and genetic organization in different strains (49). Curing
experiments have suggested that plasmid pWV05 of S. cremoris
Wg2 specifies proteolytic activity. The entire plasmid was subcloned
in E. coli (140). A 4.3-MDa HindIII fragment of the
plasmid, specifying the proteolytic activity, was cloned in B.
subtilis and in a protease-deficient S. lactis strain. In S. lactis,
the recombinant plasmid enabled the cells to grow normally in
milk with rapid acid production. The HindIII fragment specifying
the proteolytic activity of S. cremoris Wg2 was fully sequenced
(141). The nucleotide sequence revealed two open
reading frames (ORFs), ORF-1, a small ORF containing 295
codons, and ORF-2, a large ORF containing 1,772 codons. The
protein specified by ORF-2 contained regions of extensive
homology to subtilisins. The amino acids Asp32, His64, and
Ser221, involved in the formation of the active site, were well
conserved. Deletion analysis of the proteinase gene of S. cremoris
Wg2 showed that deletion of the C-terminal 343 aa did
not influence the enzyme specificity of β-casein degradation
(139). L. lactis subsp. cremoris H2 carries plasmid pDI21, containing
the gene for the protease-positive phenotype (Prt+).
The 6.5-kbp HindIII DNA fragment of pDI21 encoding the
protease was cloned in E. coli as well as in L. lactis subsp. lactis
4125 (317). Protease that specifically degrades β-casein was
expressed in both the transformed organisms. S. lactis NCDO
763 harbors plasmid pLP763, containing the gene for Prt+,
which enables it to grow to a higher density in milk. The
deduced amino acid sequence (1,902 aa) of the Prt+ phenotype
was homologous to that of the serine protease from S. cremoris
Wg2, suggesting that the genes encoding both products must
have been derived from a common ancestral gene (137).

The PIII-type protease is found only in L. lactis subsp. cremoris
AM1 and SK11. These strains are related, and they both
contain the proteases encoded by the 78-kbp plasmid pSK111.
The L. lactis subsp. cremoris SK11 prtP gene was cloned and
expressed in E. coli as well as in other subspecies of L. lactis
(50). The location and orientation of the prtP gene on pSK111
were determined by deletion analysis. A region at the C terminus
of the prtP product, which is involved in cell envelope
attachment, was identified. A deletion derivative of prtP specifying
a C-terminally truncated protease was able to express
and fully secrete the protease in the medium and showed the
capacity to degrade αs1-, β-, and κ-casein. The N-terminal
catalytic domain of the mature enzyme shows significant sequence
homology to the serine proteases of the subtilisin family
(subtilases). Comparison with the known sequences of prt
genes from L. lactis SK11, Wg2, and NCDO 763 indicated that
the VC317 protease (153) is a natural hybrid of the SK11 and
Wg2 proteases.

Stabilization of lactococcal protease genes (prtP, encoding
the cell envelope-associated serine protease, and prtM, which
activates the prtP gene product) is essential for the dairy industry.
The plasmid-located prtP and prtM genes of L. lactis
subsp. cremoris Wg2 were integrated (Campbell-like integration)
into the L. lactis subsp. lactis MG1363 chromosome by
using the insertion vector pKL9610 (158). Two transformants,
MG610 and MG611, carrying different numbers (two and 
eight, respectively) of stable tandemly integrated plasmid cop- 
ies, were obtained. Strain MG611 produced 11 times as much 
protease activity as did strain MG610 and about 1.5 times as 
much as did strain MG1363 (carrying five copies of the auton- 
omously replicating plasmid). 

A plasmid-free strain, L. lactis subsp. cremoris BC101, produces
cell envelope-associated protease that is very similar or
identical to the envelope protease encoded by the plasmid-linked
prtP gene in other strains such as Wg2 and SK11. The
prtP and prtM genes in this plasmid-free strain were identified 
on chromosomal DNA by pulsed-field gel electrophoresis 
(204). The chromosomal protease gene was shown to be orga- 
nized in a fashion similar to that of the plasmid-linked protease 
gene. Recently, Gilbert et al. cloned and sequenced the prtB 
chromosomal gene from Lactobacillus delbrueckii subsp. bul- 
garicus, encoding a protease of 1,946 residues with a predicted 
molecular mass of 212 kDa (69). The deduced amino acid 
sequence showed significant homology to the N-terminal and 
catalytic domains of lactococcal PrtP cell surface proteases. 
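
Predicted molecular masses such as the 212 kDa quoted above for a 1,946-residue protease can be sanity-checked with a simple rule of thumb. The sketch below multiplies the residue count by an assumed average residue mass of 110 Da, a commonly used approximation rather than a value taken from the text; the function name is hypothetical, and an exact mass would require summing the masses of the actual residues.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average, not from the source


def estimate_mass_kda(n_residues, average_residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Rough molecular mass (kDa) of a polypeptide from its residue count."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return n_residues * average_residue_mass_da / 1000.0


if __name__ == "__main__":
    # Residue counts and reported masses are the ones quoted in this section.
    for label, residues, reported_kda in (
        ("L. delbrueckii subsp. bulgaricus PrtB", 1946, 212),
        ("S. cacaoi Npr precursor", 550, 60),
    ):
        estimate = estimate_mass_kda(residues)
        print(f"{label}: estimated {estimate:.1f} kDa, reported {reported_kda} kDa")
```

Both estimates land within about 1% of the reported values, which is as close as a fixed per-residue average can be expected to get.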

Streptomyces. Streptomyces griseus, an organism used for the
commercial production of pronase, secretes two extracellular
serine proteases: proteases A and B. The enzymes are 61%
homologous on the basis of amino acid identity. The genes
encoding protease A (sprA) and protease B (sprB) were isolated
from the S. griseus genomic library, and their proteolytic
activity was demonstrated in S. lividans (89). The DNA sequences
suggest that each protease is initially secreted as a
precursor, which is then processed to remove an N-terminal
propeptide from the mature protease. The strong homology
between the coding regions of the two protease genes suggests
that sprA and sprB must have originated by gene duplication.
Protease B is one of the major extracellular proteases secreted
by S. griseus ATCC 10137, and its gene was expressed in S.
lividans by Hwang et al. (107). Their nucleotide sequencing of
the gene further revealed that the deduced amino acid se- 
quence was identical to that reported earlier by Henderson et 
al. (89). However, the nucleotide sequence of the 3'-flanking 
region was G rich and may be responsible for the reduced level 
of protease in S. griseus ATCC 10137 compared to the level in 
protease B-overproducing strains of S. griseus. 

The npr gene for neutral metalloprotease from S. cacaoi
YM15 was expressed in S. lividans (32). The deduced ORF
encoded a 550-aa (60-kDa) protein, whereas the Npr secreted
into the medium is 35 kDa, suggesting that it has undergone
substantial processing since separating from the precursor.

S. fradiae ATCC 14544 secretes a novel acidic-amino-acid-specific
serine protease (SFase) into the culture medium. The
deduced amino acid sequence (135) revealed a mature protein
of 187 aa and shows 82% homology to the acidic-amino-acid-specific
protease from S. griseus (111). Genes coding for a
novel protease (163), a chymotrypsin-like serine protease
(SAM-P20) (17), and SlpD and SlpE (homologs of the Tap
[major tripeptidyl aminopeptidase] mycelium-associated proteases)
(18) were cloned from S. lividans 66.

Serratia. The gram -negative bacteria belonging to the family 
Enterobacteriaceae are known to secrete large amounts of ex- 
tracellular proteases into the surrounding medium. Serratia sp. 
strain E-15 produces a potent extracellular metalloprotease, 
which is widely used as an anti-inflammatory agent. The gene 
encoding the protease from Serratia sp. strain E-15 was ex- 
pressed both in £. coli and in 5. marcescens (198). Nucleotide 
sequence analysis revealed three zinc ligands (essential for 
proteolytic activity) and an active site, as predicted by compar- 
ing the deduced amino acid sequence with that of B, thermo- 
proteolyticus thermolysin and R subtil is neutral protease. 

In another study, the extracellular serine protease (SSP) of 
S. marcescens was excreted through the outer membrane of E.
coli. The nucleotide sequence of the cloned SSP gene, together 
with the determination of the N and C termini of the excreted 
enzymes, suggested that this protease is produced as a 112-kDa 
preproenzyme composed of an N-terminal signal sequence, the 
mature protease, and a large C-terminal domain (187). 

Pseudomonas. Pseudomonas aeruginosa is an opportunistic 
pathogen and can cause fatal infections in compromised hosts. 
This virulence is related to the secretion of several extracellu- 
lar proteins (167). P. aeruginosa secretes two proteases, an 
alkaline protease and an elastase. The alkaline protease genes 
(apr) from P. aeruginosa IFO 3455 and PAO1 were cloned in E.
coli (7, 83, 254). The DNA fragment (8.8 kbp) coding for the
alkaline protease from strain PAO1 was expressed in E. coli
under the control of a tac promoter. Active enzyme was found 
to be synthesized and secreted into the medium in the absence 
of cell lysis. 

The LasA protease (elastin degrading) of P. aeruginosa is 
also an important contributor to the pathogenesis of this bac- 
terium. The enzyme shows a high level of staphylolytic activity. 
The lasA gene from strain FRD1 was overexpressed in E. coli
(82). It encodes a precursor, prepro-LasA, of about 45 kDa. 
N-terminal sequence analysis allowed the identification of a 
31-aa signal peptide. pro-LasA (42 kDa) does not undergo 
autoproteolytic processing and possesses little
anti-staphylococcal activity. The digestion of pro-LasA
either by trypsin or
by culture filtrate of the P. aeruginosa lasA deletion mutant 
yielded the active (20-kDa) staphylolytic protease. 

Aeromonas. Aeromonas hydrophila and the related
aeromonads are opportunistic pathogens of humans and fish. The
pathogenicity of the microbe may involve several extracellular 
enzymes, and it has been suggested that the proteases excreted 
by Aeromonas spp. play an important role in invasiveness and 
in establishment of the infection. Two distinct types of extra- 
cellular proteases, a temperature-stable metalloprotease and a 
temperature-labile serine protease, are found in various strains 
of A. hydrophila and other aeromonads (160). Structural genes
encoding extracellular proteases from two different
A. hydrophila strains, S02/2 and D13, were cloned in E. coli
C600-1 by using pBR322 (238). A temperature-stable protease is
secreted into the periplasm of E. coli and exhibits properties
identical to those of the protease purified from A. hydrophila
S02/2 culture supernatant. A gene for the temperature-labile
serine protease from A. hydrophila S02/2 was also expressed in
E. coli C600-1 and S. lividans 1326 (239).

Vibrio. To facilitate genetic analyses of the role of proteases
in the pathogenesis of various Vibrio species, the genes
encoding the Zn2+-metalloprotease from V. anguillarum NB10
(185), V. parahaemolyticus (155), and V. vulnificus (34) were
cloned and sequenced. The conserved Zn2+-binding domains
were identified by their homology to other metalloproteases.
The nucleotide sequence of the nprV gene encoding the
extracellular neutral protease, vibriolysin (NprV), of
V. proteolyticus revealed an ORF encoding 609 aa including a
putative
signal peptide sequence followed by a long prosequence of 172 
aa (43). Comparative analysis of the mature NprV with the 
sequences of the neutral proteases from bacilli revealed exten- 
sive regions of conserved amino acid homology with respect to 
the active site and zinc- and calcium-binding residues. NprV 
was overproduced in B. subtilis by placing the DNA encoding 
the pro-NprV and the mature NprV downstream of the Bacil- 
lus promoter and signal sequences. 

In one study, nucleotide sequence analysis of the
structural gene, hap, for the extracellular hemagglutinin
protease of V. cholerae revealed that the enzyme is produced
as a large precursor, with the amino-terminal signal sequence
followed by a propeptide (86). The deduced amino acid sequence of
the mature enzyme showed 61.5% identity to the P. aeruginosa 
elastase. 

E. coli. (i) Membrane proteases. In a bacterium, a protein
that is to be exported across the cytoplasmic membrane is 
synthesized as a large precursor with a signal peptide at its 
amino terminus (19). The processing of this precursor involves 
two sequential events: (i) removal of the signal peptide from 
the precursor through an endo-type cleavage and (ii) digestion 
of the cleaved signal peptide. The membrane proteases in- 
volved are (i) signal peptidases (lipoprotein signal peptidase 
[Lsp] and leader peptidase [Lep]) and (ii) signal peptide
peptidase (protease IV). The genes lspA (333), lep (42), and sppA
(108, 276) for protease IV of E. coli have been characterized 
and mapped on E. coli chromosomal DNA. Protease IV was 
shown to be a tetramer of the sppA gene product. 

(ii) ATP-dependent proteases. ATP-dependent proteolysis 
plays a major role in the turnover of both abnormal proteins 
and a variety of regulatory proteins in both prokaryotic and 
eukaryotic cells. Three families of ATP-dependent proteases 
are found in E. coli: La (or Lon), Clp (or Ti), and FtsH (or 
HflB) proteases. Lon and Clp are soluble proteins, whereas 
FtsH is a membrane-anchored protein. 

In vitro studies on ATP-dependent proteolysis have shown
that the major ATP-dependent activity in the extracts of E. coli
cells is the Lon protease (73). The lon gene of E. coli K-12 has
been cloned (334), sequenced (3, 35), and shown to be dis- 
pensable by insertional mutagenesis of the gene (180). Extracts 
from Lon-deficient E. coli cells still catalyze ATP-dependent 
proteolysis mediated by a soluble two-component protease, 
Clp. Two dissimilar components of Clp are (i) the ClpA
regulatory polypeptide, with two ATP-binding sites and an
intrinsic ATPase activity, and (ii) the ClpP subunit, with a
proteolytic active site. Clp is a serine protease, and its
nucleotide sequence (181) showed little homology to the known
classes of serine proteases, indicating that it represents a
unique family of serine proteases (182).

The cleavage of proteins such as casein and albumin by Clp
protease requires ClpP, the regulatory subunit ClpA, and
ATP. However, it has been observed that ClpP can
independently catalyze endoproteolytic cleavage of short
peptides at a lower rate than in the presence of ClpA and ATP.
The gene encoding ClpP is, at 10 min on the E. coli map, nearer
to the gene encoding the ATP-dependent Lon protease of E. coli
and farther from the gene encoding ClpA. Primer extension
experiments indicate that the transcription initiates immedi- 
ately upstream of the coding region for ClpP, with a major 
transcription start at 120 bases in front of the start of transla- 
tion. ClpP insertion mutants have been isolated, and strains 
devoid of ClpP are viable in the presence as well as the absence 
of Lon protease. Genetic evidence is available demonstrating 
that ClpA and ClpP act together in vivo (181). Processing of 
ClpP appears to involve an intermolecular autocatalytic
cleavage reaction which is shown to be independent of ClpA
(182). A speculative model for the chaperone-like function of
ATP-dependent proteases has been postulated by Suzuki et al.
(275). The dual function of the ATP-dependent protease is
determined by the affinity of the protein for the subunit or 
domain. Based on this, the ATP-dependent protease may reg- 
ulate the subunit stoichiometry of protein complexes. 

Miscellaneous. Among the bacterial representatives of the
trypsin family, α-lytic protease, an extracellular enzyme of the
gram-negative soil bacterium Lysobacter enzymogenes 495, is of
particular interest. Nucleotide sequence analysis and S1
mapping of the structural gene for the α-lytic protease from
L. enzymogenes 495 indicated that the enzyme is synthesized as
a prepro-protein (41 kDa) that is subsequently processed to its
mature extracellular form (20 kDa) (260). The gene was further
expressed in E. coli by fusing the promoter and signal
sequence of the E. coli phoA gene to the proenzyme portion of
the α-lytic protease gene (261). Following induction, an active
enzyme was produced both intra- and extracellularly. Fusion of
the mature protein domain alone resulted in the production of
an inactive enzyme, indicating that the large N-terminal
pro-protein region is necessary for activity. Epstein and
Wensink also cloned and sequenced the gene for α-lytic
protease, a
19.8-kDa serine protease secreted by L. enzymogenes (57). The 
nucleotide sequence contains an ORF which codes for the 
198-residue mature enzyme and a potential prepro-peptide, 
also of 198 residues. 

Achromobacter protease I (API) is a mammalian-type,
lysine-specific serine protease that specifically hydrolyzes the
lysyl peptide bond. The nucleotide sequence analysis of API 
from Achromobacter lyticus M497-1 revealed that the gene 
codes for a single polypeptide chain of 653 aa (208). The 
263-aa mature protein, which was identified by protein se- 
quencing, was found to be flanked N-terminally by 205 aa 
including a signal peptide and C-terminally by 180 aa. E. coli 
carrying a recombinant plasmid containing the API gene
overproduced and secreted the protein (API') into the periplasm.
The N-terminal amino acid sequence of API' was the same as
that of mature API, whereas the enzyme retained the
C-terminal extended polypeptide chain. The structural gene for
β-lytic protease was cloned from A. lyticus, and the nucleotide
sequence analysis of the gene revealed a mature enzyme of 179 
aa, with additional 195 aa at the N-terminal end of the enzyme, 
which includes the signal peptide (161). 

Characterization of the gene for a serine protease from
Staphylococcus aureus V8 that cleaves specifically on the
carbonyl side of acidic residues revealed a 68-residue
N-terminal extension
which includes a 19- to 29-residue signal peptide, the mature 
protein, and the C-terminal region with several repeated acidic 
amino acid-rich tripeptides (29). The C terminus may function 
as a competitive inhibitor of the prepro-protein form of the 
enzyme, perhaps to prevent activity prior to secretion. 

Aqualysin I, an alkaline serine protease, is secreted into the 
culture medium by an extreme thermophile, Thermus aquaticus
YT-1. Aqualysin I shows high DNA sequence homology to the 
subtilisin-type serine proteases, especially in the regions con- 
taining the active-site residues (Asp32, His64, and Ser221) of 
subtilisin BPN' (148). The nucleotide sequence also revealed 
that the enzyme is produced as a large precursor, containing 
the N-terminal portion, the protease, and the C-terminal por- 
tion. 

The gene (tfgA) for the major extracellular protease of Ther- 
momonospora fusca YX was isolated, sequenced, and ex- 
pressed in Streptomyces lividans (152). The ORF encoded 375 
residues including a 31-residue potential signal sequence, an 
N-terminal 150-residue prosequence, and the 194-residue
mature protease belonging to the chymotrypsin family.
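
The domain lengths quoted for the T. fusca precursor can be checked against the reported ORF length with a few lines of arithmetic. The sketch below is illustrative only: the function name check_domains and the list layout are assumptions introduced here, and the residue counts are simply those quoted in the paragraph above.

```python
def check_domains(total_residues, domains):
    """Return True if the listed domain lengths add up to the ORF length."""
    return sum(length for _, length in domains) == total_residues


# Residue counts for the T. fusca YX protease precursor, as quoted in the text.
tfusca_domains = [
    ("signal sequence", 31),
    ("prosequence", 150),
    ("mature protease", 194),
]

# Compare the summed domains with the 375-residue ORF.
print(check_domains(375, tfusca_domains))
```

The same helper can be applied to any of the other precursors described in this section, provided that every domain of the precursor is listed.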

Alteromonas sp. strain O-7, a marine bacterium, excretes
alkaline serine proteases or subtilases (AprI and AprII) into
the growth medium. The results of the deduced amino acid
sequence analysis of genes for both AprI and AprII indicated
that both enzymes are produced as large precursors
consisting of four domains: the signal sequence, the N-terminal
pro-region, the mature AprI or AprII, and the C-terminal
extension (298, 299). The amino acid sequence of mature AprI
shows high sequence homology to that of class I subtilase,
while the sequence of AprII shows high sequence homology to
that of class II subtilase. Repeated sequences were observed in
the C-terminal pro-region, showing high homology to
sequences from the C-terminal pro-regions of proteases from
other known gram-negative bacteria (V. alginolyticus,
Xanthomonas campestris, and V. proteolyticus).

IgA family of proteases. Immunoglobulin A1 (IgA1)
proteases form a very heterogeneous group of extracellular
endopeptidases produced by a number of bacterial pathogens
that colonize human mucosal surfaces. The enzymes
specifically cleave human IgA1, which participates in the immune
system surveillance in the human mucosa. A number of reports
(62, 224, 232) on the cloning of the iga gene, encoding the IgA1
protease from Neisseria gonorrhoeae, are available. Nucleotide
sequence analysis revealed that the enzyme is produced as a
large precursor with three functional domains, i.e., the
N-terminal leader peptide, the protease, and the
carboxy-terminal "helper" domain. An overall structural
similarity to the iga gene from N. meningitidis was also
demonstrated (169).

Comparison of the deduced amino acid sequence of the iga
gene of Haemophilus influenzae serotype b with that of a
similar protease from N. gonorrhoeae revealed several domains
with a high degree of homology (228). An enzyme secretion
mechanism analogous to that for N. gonorrhoeae IgA1 protease
was proposed for H. influenzae IgA1 protease. Limited
diversity has been found among the IgA1 protease genes of
H. influenzae serotype b strains (230), information that is useful
from the point of view of vaccine preparation.



Cloning of streptococcal IgA1 protease genes from
Streptococcus sanguis ATCC 10556 (70) and S. pneumoniae
(229, 308) has been reported. Hybridization experiments with an
S. sanguis IgA1 protease gene probe showed no detectable
homology to chromosomal DNA of gram-negative bacteria
secreting IgA1 proteases. The gene encoding IgA1 protease from
S. pneumoniae was identified by using the S. sanguis protease
probe. However, the iga gene was found to be highly
heterogeneous among streptococcal species.

From the foregoing, it can be seen that subtilisins (270) and 
neutral proteases (279, 323) of various Bacillus species, the 
α-lytic protease from L. enzymogenes (57, 260), and proteases
A and B from S. griseus (89) have long polypeptide extensions
at their N termini. The IgA protease of N. gonorrhoeae (224)
and the protease of S. marcescens (322) have C-terminal
extensions. Achromobacter protease I (208), aqualysin I from
T. aquaticus (148), and AprI and AprII from Alteromonas sp.
strain O-7 (298, 299) bear long peptide chains at both the N
and C termini. The function of the pre-peptide portion (signal 
peptide) in these precursors is possibly to assist in the transport 
of the secretory protein across the cytoplasmic membrane. The 
exact role of the pro-peptide region is not known; possibly the 
long peptide serves to inhibit the mature protease to which it 
is connected (29, 57). It is also possible that the pro-peptide 
helps the protease to fold into its active form (111, 261). 

Fungi 

As in bacteria, cloning of the protease genes of fungi has 
been attempted from both the commercial and pathogenicity 
points of view. 

Filamentous fungi. (i) Acidic proteases. (a) Mucor. Two
closely related species of zygomycete fungus, Mucor pusillus
and Mucor miehei, secrete aspartate proteases, also known as
mucor rennins, into the medium. The enzymes possess high 
milk-clotting activity and low proteolytic activity, enabling 
them to be used as substitutes for calf chymosin in the cheese 
industry. 

Sequencing of the cloned gene encoding M. pusillus rennin 
(MPR) revealed an ORF without introns, encoding possible 
pre-pro-sequences (66 aa) upstream of the mature MPR se- 
quence (296). The deduced amino acid sequence showed a 
high degree of homology to that of M. miehei rennin (MMR). 
The gene encoding M. miehei aspartyl protease (MMAP) has 
also been cloned and sequenced (79). The deduced primary 
translation product showed an N-terminal extension which ap- 
pears to comprise a signal peptide of 22 aa and a propeptide of 
47 aa. Fungal aspartyl proteases are structurally related to each 
other and to the gastric aspartyl proteases chymosin and pep- 
sin; therefore, they may be activated in a manner similar to 
their gastric counterparts. When the gene encoding the
prepro-form of MPR was cloned in S. cerevisiae under the control
of the yeast GAL7 promoter, an inactive zymogen of the
enzyme with the 44-aa pro-sequence was identified in the
medium during the initial stage of cultivation (94). In vitro
conversion of the zymogen to mature MPR was shown to proceed
autocatalytically under acidic conditions.

(b) Rhizopus. Rhizopus niveus, belonging to the zygomycete
class, also secretes aspartyl protease abundantly. The gene 
encoding R. niveus aspartic protease (RNAP) was cloned and 
sequenced (100). Comparison of the deduced amino acid se- 
quence with the amino acid sequence of rhizopuspepsin of R. 
chinensis (282) revealed that the RNAP gene has an intron 
within its coding region. A prepro-sequence of 66 aa upstream 
of the mature enzyme was also revealed. High-level secretion 
of RNAP-I was achieved by subcloning the RNAP-I gene into
Saccharomyces cerevisiae (101). Yeast cells carrying the intact
RNAP-I gene under the control of the
glyceraldehyde-3-phosphate dehydrogenase gene promoter of
S. cerevisiae were unable to synthesize RNAP-I. On removal of
the intron of the RNAP-I gene, the cells secreted the enzyme
with high efficiency.

(c) Aspergillus. The pepA gene encoding the aspartic
protease, aspergillopepsin A, from Aspergillus awamori (15), the
pepA gene from A. oryzae (74), and the cDNA coding for an
elastinolytic aspartic protease, aspergillopepsin F, from
A. fumigatus (156, 237) were cloned and sequenced. The nucleotide
sequence data revealed that the ORFs encoding aspartic
proteases in these aspergilli are composed of four exons.
Prepropeptides of 69, 78, and 70 aa were found to precede 395-,
326-, and 323-aa mature proteins of A. awamori, A. oryzae, and
A. fumigatus, respectively. The amino acid sequence of
aspergillopepsin F shows 70, 66, and 67% homology to the
sequences of those from A. oryzae, A. awamori, and A. saitoi,
respectively. The primary structure of aspergillopepsin I from
A. saitoi ATCC 14332 (now designated A. phoenicis) was deduced
from the nucleotide sequence of the gene (257). The cDNA of the
gene was also cloned and expressed in yeast cells.

Two types of acid endopeptidases, acid proteases A and B 
(commercially named proctase A and B), are known to be 
secreted into the medium by A. niger var. macrosporus.
Protease B is a typical aspartic protease, inhibited by pepstatin,
whereas protease A is not inhibited by pepstatin. Sequencing 
of the protease A gene revealed an 846-bp structural gene 
without any introns encoding the precursor form of the enzyme 
(114, 125, 283). The precursor, of 282 residues, includes an 
N-terminal prepro-sequence of 59 residues, the L chain of 39 
residues, an intervening sequence of 11 residues, and the H 
chain of 173 residues linked in that order. The deduced amino 
acid sequence (394 residues) of the prepro-form of protease B 
showed 98% homology to the sequences of aspergillopepsin I 
from A. awamori and A. saitoi and 68% homology to that of
aspergillopepsin I from A. oryzae (113, 175). The cDNA was
expressed in E. coli, and the purified pro-protease B showed 
protease activity under acidic conditions (pH 2 to 4). 
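
The figures given above for the protease A precursor are internally consistent, and this can be verified by converting the gene length to codons and summing the listed segments. The following is a minimal sketch under one stated assumption: that the 846-bp figure counts only the codons of the precursor (no stop codon). The function name codons_in and the dictionary of segments are hypothetical names introduced for illustration.

```python
def codons_in(orf_bp):
    """Number of complete codons in an ORF of the given length (base pairs)."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3


# Segment lengths (residues) of the protease A precursor, as quoted in the text.
segments = {
    "prepro-sequence": 59,
    "L chain": 39,
    "intervening sequence": 11,
    "H chain": 173,
}

precursor_residues = sum(segments.values())

# Both values should agree with the 282-residue precursor reported above.
print(codons_in(846), precursor_residues)
```

If the reported gene length did include a stop codon, the codon count would exceed the residue count by one, so a mismatch of exactly one would not by itself indicate an error.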

(ii) Alkaline proteases. (a) Aspergillus. Alkaline protease
(Alp) produced by A. oryzae, a filamentous ascomycete used in
the manufacture of soy sauce, is considered to play an
important role in producing the flavor of soy sauce by
hydrolyzing the raw materials. Tatsumi et al. constructed the
cDNA library of A. oryzae ATCC 20386 in pUC119 and isolated a
cDNA (1,100 bp) encoding the mature region of Alp (286). The
nucleotide sequence of the cDNA lacked most of the DNA
sequences corresponding to the prepro-region. The entire cDNA
coding for prepro-Alp was cloned and expressed in
S. cerevisiae (288). The properties of the Alp secreted from
S. cerevisiae were shown to be identical to those of the native
Alp. The predicted mature Alp consists of 282 aa and shows
homology to other serine proteases of subtilisin families from
bacteria as well as from fungi. Alp has a 121-aa prepro-region
wherein the N-terminal 21 residues show the characteristics of
a signal peptide. Alp expressed in S. cerevisiae was secreted
with the N terminus processed correctly, analogous to the
expression in S. cerevisiae of aspartic protease from
M. pusillus (321). The prepro-Alp cDNA of A. oryzae was further
cloned into an osmophilic yeast, Zygosaccharomyces rouxii
(207). The recombinant Z. rouxii secreted a large amount of Alp
(about 300 mg/liter) into the culture medium. The Alp gene is
1,374 nucleotides long and contains three introns, one in the
pro-region and two in the mature protein region (195).

A gene encoding Alp from the A. oryzae Thailand industrial
strain was isolated from the genomic library by using
oligodeoxyribonucleotide probes based on the A. oryzae ATCC
20386 cDNA sequence (33). By comparison with the published
cDNA sequence (286), Alp from A. oryzae Thailand was found
to be encoded by four exons. Transformation of the alpA gene
into the high-level-Alp-producing A. oryzae strain U212,
obtained by UV mutagenesis, resulted in the production of up to
five times as much Alp as in the parental strain. A. fumigatus
and A. flavus, the agents of invasive aspergillosis, secrete highly
homologous serine proteases. The genomic as well as cDNA
clones encoding elastinolytic Alp from both A. fumigatus (123,
237) and A. flavus (233) were sequenced. The A. nidulans prtA
gene coding for Alp was isolated by using the gene encoding
A. oryzae Alp (131). The nucleotide sequence of prtA was
determined, and the deduced amino acid sequence showed a high
degree of similarity to Alp from A. fumigatus, A. flavus, and
A. oryzae. prtA transcription was shown to be dependent on the
medium composition.

(b) Acremonium. Acremonium chrysogenum ATCC 11550
(Cephalosporium acremonium) produces a considerable
amount of extracellular Alp. The cDNA and genomic DNA
encoding Alp were isolated from the A. chrysogenum cDNA
and genomic DNA libraries, respectively (115). The nucleotide
sequence of the gene was determined. The deduced amino acid
sequence showed 57% homology to that of A. oryzae Alp.
Cloning of the entire cDNA encoding A. chrysogenum Alp into
S. cerevisiae directed the secretion of enzymatically active Alp
into the culture medium.

(c) Fusarium. The transfer of the Fusarium alkaline protease
gene (136) into A. chrysogenum resulted in transformants
producing large amounts of Alp (193). Southern hybridization
analysis, as well as PCR of genomic DNAs from these
transformants, showed chromosomal integration of the full-length
alp gene. The enzyme secreted by A. chrysogenum had
properties identical to those of the native Fusarium Alp, indicating
that the Alp promoter, signal sequence, and introns functioned
correctly in A. chrysogenum.

(iii) Serine proteases. (a) Tritirachium. Proteinase K is a
serine endoproteinase excreted by the fungus Tritirachium
album Limber. The enzyme is able to hydrolyze native proteins
rapidly and is active in the presence of denaturing agents
(urea, sodium dodecyl sulfate, etc.), making proteinase K one
of the
most useful tools in molecular biology. The enzyme exhibits
strong similarity to the bacterial subtilisins. The genomic DNA 
as well as the cDNA encoding proteinase K from T. album 
Limber have been cloned in E. coli, and the entire nucleotide 
sequence of the coding region, including the 5'- and 3'-flanking 
regions, has been determined (81). The nucleotide sequence 
analysis revealed that the primary secreted product is a zymo- 
gen containing a 15-aa signal sequence and a 90-aa pro-pep- 
tide. The pro-peptide is presumably removed in the later steps 
of the secretion process or upon secretion into the medium. 
The proteinase K gene was shown to be composed of two exons 
and one 63-bp intron located in the proregion. The pro-pro- 
teinase K gene was expressed in E. coli under the control of the 
tac promoter. 

The coding sequence of proteinase T from T. album Limber 
(247) was shown to be interrupted by two introns. The deduced 
amino acid sequence showed 53% identity to that of proteinase 
K. The presence of four cysteines in the mature proteinase,
probably in the form of two disulfide bonds, explains the ther- 
mal stability of proteinase T. The proteinase T cDNA was 
expressed in E. coli, and the authenticity of the product was 
confirmed by Western blotting and N-terminal analysis of the 
recombinant product.

(iv) Metalloproteases. (a) Aspergillus. Jaton-Ogay et al.
(124) and Sirakova et al. (262) cloned and sequenced the gene
as well as the cDNA encoding the 42-kDa elastinolytic
metalloproteinase (MEP) of A. fumigatus. Comparison of the
nucleotide sequences revealed that the genomic and the cDNA
sequences are analogous except for four introns interrupting
the ORF. The enzyme was shown to be produced in a
prepro-form, with a 384-aa mature protease region. In another study,
no intron was found in the ORF of A. flavus mep20 (encoding
a 23-kDa MEP) whereas a 59-bp intron was present in the gene
from A. fumigatus (a homolog of mep20) (234). The MEP20
proteins of A. flavus and A. fumigatus have 68% identity.

Yeasts. (i) Acidic proteases. The yeast Saccharomycopsis
fibuligera produces an extracellular acid protease (PEP1).
DNA coding for the secretable acid protease gene of
S. fibuligera was isolated (95, 320). The enzyme produced by
Saccharomyces cerevisiae cells that are transformed with a
plasmid carrying the cloned gene showed enzymatic properties
similar to those of the S. fibuligera protease.

Two groups of workers in the United States (4, 315)
worked simultaneously on the PEP4 gene of S. cerevisiae,
which encodes an aspartyl protease implicated in the
posttranslational regulation of the yeast vacuolar hydrolases.
The PEP4 gene was isolated from a genomic library by
complementation of the pep4-3 mutation. The nucleotide sequence
was deduced, and the predicted amino acid sequence showed 
substantial homology to that of the aspartyl protease family. 

The deduced primary translation product (587 aa) of BAR1,
the structural gene for the barrier activity of S. cerevisiae, has a
putative signal peptide and nine potential asparagine-linked 
glycosylation sites (177). Marked sequence similarity to pepsin- 
like proteases was observed. 

A gene for yeast aspartyl protease 3 (YAP3) allowing
KEX-2-independent MFα pro-pheromone processing was isolated
from S. cerevisiae (53). The nucleotide sequence of the
YAP3-encoding gene was determined, and the deduced amino acid
sequence was shown to exhibit extensive homology to a
number of aspartyl proteases, including the PEP4 and BAR1
proteins of S. cerevisiae. A potential transmembrane domain
similar to that found in the KEX-2 gene product was also located.

Candida albicans and Candida tropicalis are the medically
more important opportunistic pathogens causing infections in
immunocompromised patients. Their secretory proteolytic
activity is considered to be a major virulence factor. The deduced
amino acid sequence of the acid protease (ACP) from
C. tropicalis shows similarity to the amino acid sequence of the
pepsin family (294). The aspartyl proteinase gene (106, 170)
and cDNA (196) from various C. albicans strains were cloned
and sequenced. The genes for secreted aspartic proteases (the
SAP1, SAP2, SAP3, and SAP4 genes) in C. albicans constitute
a multigene family. Three putative new members, SAP5, SAP6,
and SAP7, were also isolated and sequenced. Evidence was
also obtained for the existence of SAP multigene families in
other Candida species such as C. tropicalis, C. parapsilosis, and
C. guilliermondii (191).

The amino acid sequence of an acid extracellular protease
(AXP) from Yarrowia lipolytica 148 deduced from the
nucleotide sequence revealed a putative 17-aa pre-peptide, a 27-aa
pro-peptide, and a 353-aa mature protein (37 kDa) (331). AXP
showed homology to proteases of several fungal genera. The
transcription of both the AXP and the alkaline extracellular
protease (AEP) genes in Y. lipolytica was shown to be regulated
by the pH of the culture (331).

A gene encoding an extracellular protease was cloned from 
a wild-type yeast into brewer's yeast, S. cerevisiae (332). Such
genetically engineered strains carrying the gene for an extra- 
cellular protease were shown to exhibit chill-proofing activity 
in beer. Proteins remaining in beer after its brewing from malt
tend to form hazes during chilling due to their poor solubility
at lower temperatures. Acid proteases assist in reducing the 
haze formation by degrading the proteins in beer without af- 
fecting foam stability or organoleptic properties such as taste. 

(ii) Alkaline protease. The XPR2 gene for AEP from Y.
lipolytica encodes a putative 22-aa pre-peptide followed by a 
135-aa pro-peptide containing a possible N-linked glycosyla- 
tion site and the two Lys-Arg peptidase-processing sites (44, 
201). The mature protease (297 aa) contains two potential 
glycosylation sites. 

(iii) Serine proteases. (a) Kluyveromyces. The KEX-1 gene
product is required for the production of a killer toxin by
Kluyveromyces lactis. The deduced amino acid sequence (700
aa) encoded by KEX-1 showed an internal domain with a
striking homology to the sequences of the subtilisin-type pro- 
teinases (285). 

(b) S. cerevisiae. The KEX-2 gene, encoding a subtilisin-like 
endoprotease responsible for posttranslational processing of 
certain gene products, contains a 2,442-bp ORF encoding a 
polypeptide of 814 aa (188, 212). The deduced amino acid
sequence revealed a region near the N terminus that has ex- 
tensive homology to the subtilisin family of serine proteases. A 
putative membrane-spanning domain near the C terminus was 
also detected. The wild type and the C-terminal deletion de- 
rivatives showed similar substrate specificities, with the highest 
activity being against Arg-Arg dipeptides.

(iv) Other proteases. Yeast carboxypeptidase (CPY) is a
glycosylated yeast vacuolar protease that is used commercially
in peptide synthesis. CPY is encoded by the PRC1 gene. To
increase the production of CPY in S. cerevisiae, PRC1 was
placed under the control of the strongly regulated yeast GAL1
promoter on the multicopy plasmids and introduced into vpl1
mutant strains (202). About a 200-fold increase in the level of
secreted CPY (40 mg/liter) was obtained compared to the level
in a vpl1 mutant carrying a single copy of the wild-type PRC1
gene. Sodium dodecyl sulfate-polyacrylamide gel
electrophoresis revealed two forms of secreted active CPY,
probably due to the different levels of glycosylation. The
structural gene PRB1, encoding the vacuolar protease B of
S. cerevisiae, was cloned by complementation of the prb1-1122
mutation (190).

PRG1, a yeast gene encoding the 32-kDa proteasome, which
shows 55.6% sequence homology to 80% of the RING10 gene
product (human proteasome), was identified (65). Genomic
disruption of PRG1 revealed that it is essential for yeast
growth. The results strongly indicate that the antigen-process- 
ing system present in vertebrates has evolved from a basic 
cellular process present in all organisms. 

Viruses 

Gene cloning of viral proteases has been undertaken for the 
isolation and overexpression of the gene and for subsequent 
screening of inhibitory compounds that may be used in the 
development of chemotherapeutic agents. Viral protease is 
responsible for processing of polyprotein precursors into the 
structural proteins of the mature virion. Among viruses, re- 
ports on cloning of protease genes are limited mainly to animal 
viruses (Table 5). 

Animal viruses. (i) Herpesviruses. Each member of the her- 
pesvirus family encodes a unique serine protease in association 
with a capsid assembly protein, with the associated ORFs being 
designated UL80 and UL26 in human cytomegalovirus 
(HCMV) and in herpes simplex virus type-1 (HSV-1), respec- 
tively. The UL26 gene encodes a protease responsible for the 
C-terminal cleavage of the nucleocapsid-associated proteins 
(ICP35c and ICP35d) to their posttranslationally modified 
counterparts (ICP35e and ICP35f). The protease expressed in 
E. coli exhibited autoprocessing and specifically cleaved the 
ICP35 protein assembly (47). Similarly, genes encoding pro- 
teases from HSV-2, murine cytomegalovirus (MCMV), and 
human herpesvirus 6 (HHV-6) have been studied (172, 271, 
292). Such studies assist in the investigation of the role of 
proteolytic processing in the virus. 

(ii) Adenoviruses. Adenoviruses code for a serine-centered, 
neutral protease specific for selected Gly-Ala bonds in several 
virus-encoded precursor proteins that are required for virion 
maturation and infectivity. To determine the functional do- 
mains of this key enzyme, protease genes from various types of 
adenoviruses have also been cloned and sequenced (102-104, 
306). 

(iii) Retroviruses. The genomic organization of retroviruses 
is 5'-LTR-gag-pro-pol-env-LTR-3' (where LTR is a long termi- 
nal repeat). The pro/prt gene product is an aspartyl protease, 
which is responsible for processing the gag and pol polyprotein 
precursors into the structural proteins of the mature virion. 
Comparison of the genomic organization of certain retrovi- 
ruses revealed that prt lies in the carboxyl terminus of gag in 
Rous sarcoma virus (RSV) (252) and avian sarcoma leukosis 
virus (ASLV) (144); in the amino terminus of pol in AIDS- 
associated retrovirus type 2 (ARV-2) (248); in the same read- 
ing frame as both gag and pol in Moloney murine leukemia 
virus (M-MuLV) (256); and as a separate reading frame in 
simian AIDS retrovirus type 1 (SRV-1) (231), human T-cell 
leukemia virus type 2 (HTLV-2) (255), bovine leukaemia virus 
(BLV) (245), and Mason-Pfizer monkey virus (MPMV) (268). 
Besides cloning and sequencing of the prt gene, there are a few 
reports on expression of the gene in E. coli (40, 105, 144). 
Significant inhibition of the expressed protease activity by pep- 
statin A confirmed that HTLV-1 protease is a member of the 
aspartyl protease group (246). 

Human immunodeficiency virus (HIV), a causative agent of 
AIDS, is also a member of the family Retroviridae. The virus 
exhibits the same overall gag-pol-env genome organization as 
that of other retroviruses. The genome-size mRNA of HIV-1 is 
translated into two polyproteins: Pr55 (gag gene product) and 
Pr160 (gag-pol gene product). Cleavage of these polyproteins 
by the viral protease into smaller structural proteins and rep- 
lication enzymes such as reverse transcriptase and integrase is 
necessary to produce infectious progeny virions from imma- 
ture virus particles. The enzyme, a part of the polyprotein, has 
a highly conserved sequence, Asp-Thr-Gly, which is homolo- 
gous to the active site of the aspartic proteases and is thought 
to belong to this enzyme family (216). The protease is essential 
for the retroviral life cycle, as indicated by the production of 
noninfectious, replication-deficient virions by Moloney murine 
leukemia virus variants mutated in the protease-encoding re- 
gion (130). This suggests that HIV protease is a good target for 
chemotherapy and that specific inhibitors of this enzyme may 
have a significant function in the treatment of AIDS without 
interfering with the host cell physiology. To obtain sufficient 
quantities of the HIV protease for biochemical and structural 
analyses, several groups have described expression of the re- 
combinant HIV-1 protease in E. coli (46, 78, 171). Pichuantes 
et al. have reported extracellular expression of HIV-1 aspartic 
protease in S. cerevisiae (222). The expressed enzyme was 
shown to exhibit a proteolytic activity, as has been shown to be 
associated with the purified HIV-1 virions (164). Debouck et 
al. expressed the HIV protease gene product in E. coli (46). 
The product was shown to autocatalyze its maturation from a 
larger precursor and to process an HIV Pr55 gag protein when 
coexpressed in E. coli. This allowed a structure-function anal- 
ysis of the HIV protease and provided a simple assay for the 
development of potential therapeutic agents directed against 
the critical viral enzyme. 
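
The conserved Asp-Thr-Gly signature mentioned above lends itself to a simple sequence scan. The following is a minimal sketch, assuming a protein sequence written in one-letter code; the function name `find_motif` and the toy fragment are hypothetical illustrations and are not taken from any of the cited studies.

```python
import re


def find_motif(protein, motif="DTG"):
    """Return the 1-based start position of every occurrence of `motif`."""
    # A zero-width lookahead reports overlapping occurrences as well.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in pattern.finditer(protein)]


# Hypothetical toy fragment (NOT a real protease sequence).
toy_fragment = "MKLLDTGADDTVLDTGS"
print(find_motif(toy_fragment))
```

A hit of this kind only flags a candidate active-site region; assignment to the aspartic protease family rests on the experimental evidence cited in the text, such as inhibition by pepstatin A.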

(iv) Picornaviruses. Human rhinovirus is a member of the 
picornavirus (small RNA) family. Rhinovirus has commercial 
importance since it is the causative agent of about 15% of cases 
of the common cold. A cDNA encoding the viral protease from 
the 3C region of human rhinovirus type 14 was expressed in E. 
coli through the use of a periplasmic secretion vector (162). A 
comparison of the 3C protease regions of all the available 
picornavirus (foot-and-mouth disease virus, encephalomyocar- 
ditis virus, and poliovirus) sequences revealed two completely 
conserved residues, Cys147 and His161, which may be the 
reactive residues of the active sites of these cysteine proteases 
(6). 
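
Finding "completely conserved" residues of this kind amounts to scanning the columns of a multiple alignment. The sketch below illustrates the idea under stated assumptions: the sequences are already aligned to equal length, gaps are written as "-", and the three toy sequences are hypothetical rather than real 3C protease sequences. The function name `conserved_columns` is ours.

```python
def conserved_columns(alignment, gap="-"):
    """Return (1-based column, residue) pairs identical in every sequence."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i, column in enumerate(zip(*alignment), start=1):
        residues = set(column)
        # A column counts only if one residue fills it and it is not a gap.
        if len(residues) == 1 and gap not in residues:
            conserved.append((i, column[0]))
    return conserved


# Hypothetical toy alignment (NOT real picornavirus sequences).
toy_alignment = ["GACQ-HG", "SACKMHA", "GLCRVHG"]
print(conserved_columns(toy_alignment))
```

Column numbers here refer to alignment positions; mapping them to residue numbers such as Cys147 requires choosing a reference sequence and skipping its gaps.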

Plant viruses. Potyviruses are a cause of serious losses of 
several major crop plants. In plants infected with the potyvi- 
ruses, inclusions consisting of viral proteins are found in the 
cell nucleus. One of them, the nuclear inclusion protease 
(NIa), is the major viral protein responsible for the proteolytic 
maturation of the polyprotein encoded by the virus. The elu- 
cidation of the structure of such virus-encoded proteins could 
eventually facilitate the design of novel polypeptides which 
bind to them and inhibit their functions. With this objective, 
cDNAs for NIa proteases were cloned and sequenced from 
bean yellow mosaic virus (22) and zucchini yellow mosaic virus 
(Singapore isolate) (316). 

The potential contributions of genetic engineering to man- 
kind are enormous and will benefit agriculture, animal hus- 
bandry, environmental protection, food production and pro- 
cessing, human health care, manufacture of biochemicals and 
biofuels, etc. In general, the application of genetic engineering 
to proteases will facilitate their use in industry and enable the 
development of therapeutic agents against the proteases that 
are important in the life cycle of organisms which cause serious 
diseases. 

PROTEIN ENGINEERING 

Many industrial applications of proteases require enzymes 
with properties that are nonphysiological. Protein engineering 
allows the introduction of predesigned changes into the gene 
for the synthesis of a protein with an altered function that is 
desired for the application. Recent advances in recombinant 
DNA technology and the ability to selectively exchange amino 
acids by site-directed mutagenesis (SDM) have been respon- 
sible for the rapid progress of protein engineering. Identifica- 
tion of the gene and knowledge of the three-dimensional struc- 
ture of the protein in question are the two main prerequisites 
for protein engineering. The X-ray crystallographic structures 
of several proteases have been determined (39, 143, 223, 267). 
Proteases from bacteria, fungi, and viruses have been engi- 
neered to improve their properties to suit their particular ap- 
plications. 

Bacteria 

Subtilisin has been chosen as a model system for protein 
engineering since a lot of basic information about this com- 
mercially important enzyme is available. Its pH dependence 
(290), catalytic activity (278, 281), stability to heat or denatur- 
ing agents (112, 199), and substrate specificity (10, 14, 30, 59, 
243) have been altered through SDM. A slightly reduced rate 
of thermal inactivation was observed for a subtilisin BPN' 
variant containing two cysteine residues (Cys22, Cys87) (186, 
214). Oxidation of Met222 adjacent to the Ser221 in the active 
site of subtilisin reduces the catalytic activity of subtilisin. 
Substitution of Met222 with different amino acids 
revealed that small side chains yield the highest activity. The 
mutant enzymes Ser222, Ala222, and Leu222 were active and 
stable to peroxide for 1 h. Probing of the specificity of the S1 
binding site of Met222 Cys/Ser mutants of subtilisin from 
Bacillus lentus with boronic acid inhibitors revealed similar bind- 
ing trends for the mutant and the parent (269). Disulfide 
bonds introduced into subtilisin away from its catalytic center 
were shown to increase its autolytic stability (312). 
Higher thermostability of subtilisin E as a result of introduc- 
tion of a disulfide bond engineered on the basis of structural 
comparison with a thermophilic serine protease has been re- 
ported (280). Strausberg et al. have created the environment 
for stabilization of subtilisin by deleting the calcium-binding 
loop from the protein (273). Analysis of the structure and 
stability of the prototype with the loop deleted followed by 
SDM resulted in a mutant with native proteolytic activity and 
1,000-fold-greater stability under strongly chelating conditions. 
SDM-mediated substitution of Asn241 buried in the neutral 
protease of B. stearothermophilus by leucine resulted in an 
increase in thermostability of 0.7 ± 0.1°C (55). The thermo- 
stability of the neutral protease from B. subtilis was increased 
by 0.3 and 1.0°C by replacing Lys with Ser at positions 249 and 
290, respectively, whereas the Asp249 and Asp290 mutants 
exhibited an increased stabilization by 0.6 and 1.2°C, respec- 
tively (54). 

A protein engineering study was undertaken by Bruinenberg 
et al. to determine the functions of one of the largest loop 
insertions (residues 205 to 219), predicted to be spatially close 
to the substrate-binding region of the SK11 protease from L. 
delbrueckii and susceptible to autoproteolysis (28). Deletion or 
modification of this loop was shown to affect the activity and 
autoprocessing of the protease. Graham et al. showed that 
random mutagenesis of the substrate-binding site of α-lytic 
protease, a serine protease secreted by the soil bacterium Ly- 
sobacter enzymogenes, generated enzymes with increased activ- 
ities and altered primary specificities (77). Substitution of His120 
by Ala in the LasA protease of P. aeruginosa yielded an enzyme 
devoid of staphylolytic activity. Thus, His120 was shown to be 
essential for LasA activity (82). 

Fungi 

Fungal aspartic proteases are able to cleave substrates with 
Lys in the P1 position. Sequencing and structural compari- 
son suggest that two aspartic acid residues (Asp30 and Asp77) 
may be responsible for conferring this unique specificity. 
Lowther et al. engineered the substrate specificity of rhizopus- 
pepsin from Rhizopus niveus and demonstrated the role of 
Asp77 in the hydrolysis of the substrates with lysine in the P1 
position (173). 

The primary structure of aspergillopepsin I from Aspergillus 
saitoi ATCC 14332 (now designated A. phoenicis) was deduced 
from the nucleotide sequence of the gene (257). To identify the 
residue responsible for determining the specificity of aspergil- 
lopepsin I toward the basic substrates in the substrate-binding 
pocket, Asp76 was replaced with a Ser residue by SDM. The 
striking feature of this mutation was that only the trypsinogen- 
activating activity of the enzyme was destroyed, suggesting the 
importance of the Asp76 residue in binding to basic substrates. 

To elucidate whether the processing of the pro-region oc- 
curs by autoproteolysis or by involving a processing enzyme, 
Tatsumi et al. changed Ser228 to Ala by SDM (287). S. cerevi- 
siae cells harboring a recombinant plasmid with mutant Alp did 
not secrete active Alp into the culture medium. The yeast cells 
accumulated a protein of 44 kDa, probably a precursor of Alp 
(the 34-kDa mature Alp plus the 10-kDa pro-peptide), sug- 
gesting that the pro-region is normally removed by 
autoproteolytic processing. 

Introduction of a disulfide bond by SDM is known to en- 
hance the thermostability of a cysteine-free enzyme. Aqualysin 
I, a thermostable subtilisin-type protease from Thermus aquati- 
cus YT-1, contains four Cys residues forming two disulfide 
bonds (149). The primary structure of Alp showed 44% ho- 
mology to that of aqualysin I, and sites for Cys substitutions to 
form a disulfide bond were chosen in the Alp based on this 
homology. Ser69, Gly101, Gly169, and Val200 were replaced 
by Cys in the mutant Alp. Both Cys69-Cys101 and Cys169- 
Cys200 mutant Alps were expressed in S. cerevisiae, and the 
enzymes were purified to homogeneity. The Cys169-Cys200 
disulfide bond was shown to increase the thermostability as 
well as the thermoresistance of Aspergillus oryzae Alp (110). 

In vitro mutation of an aspartic acid residue predicted to be 
in the active site abolished the barrier activity of S. cerevisiae 
(177). BAR1 possesses a carboxyl-terminal domain of an un- 
known function, and deletion of 166 of 191 aa of this region 
had no significant effect on the barrier activity. 

Viruses 

The protease of HCMV was rendered stable by conversion 
of one of the three Val141, Val207, or Val254 residues to Gly 
by SDM (151). The resulting stable proteases are useful as 
screening tools for HCMV antiviral agents and as diagnostic 
tools for diseases resulting from HCMV infection. 

Replacement of Asp64, a residue from the catalytic core 
sequence among aspartyl proteases, with Gly was shown to 
abolish the correct processing of the 53K gag precursor by 
HTLV-1 gag protease (87). 

In poliovirus, the mutation of highly conserved residues, e.g., 
Cys147 or His161, produced an inactive enzyme while muta- 
tion of a nonconserved residue, Cys154, had only a negligible 
effect on the proteolytic activity (117). 

The protein-engineering technique has been exploited suc- 
cessfully for obtaining proteases which show unique specificity 
and/or stability at high temperature and pH. It has also con- 
tributed substantially to our understanding of the structure- 
function relationship of proteases. In future, protein engineer- 
ing will offer possibilities of generating proteases possessing 
entirely new functions. 

SEQUENCE HOMOLOGY 

Studies of DNA and protein sequence homology are impor- 
tant for a variety of purposes and have therefore become 
routine in computational molecular biology. They serve as a 
prelude to phylogenetic analysis of proteins and assist in pre- 
dicting the secondary structure of DNA and proteins. Pro- 
teases are a complex group of enzymes and vary enormously in 
their physicochemical and catalytic properties. The nucleotide 
and amino acid sequences of a number of proteases have been 
determined, and their comparison is useful for elucidating the 
structure-function relationship (5). The homology of proteases 
with respect to the nature of the catalytic site has been studied 
(12, 13). Accordingly, the enzymes have been allocated to 
evolutionary families and clans. It has been suggested that 
there may be as many as 60 evolutionary lines of peptidases 
with separate origins. Some of these contain members with 
quite diverse peptidase activities, and yet there are some strik- 
ing examples of convergence (236). 

A number of reports on the homology of proteases are 
available. Takagi et al. found that the thermostable proteases 
of Bacillus stearothermophilus and B. thermoproteolyticus are 
85% homologous and the thermolabile proteases of B. subtilis 
and B. amyloliquefaciens are 82% homologous, whereas the 
thermostable protease of B. stearothermophilus is only 30% 
homologous to the thermolabile protease of B. subtilis (279). 
However, an amino acid sequence of 17 residues, which also 
includes the active-site histidine residue, was found to be 
highly conserved in all four neutral proteases, suggesting that 
they have the same three-dimensional structure around the 
active site despite the difference in their source and physico- 
chemical properties such as thermostability. 
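
Percentages such as those quoted above are typically derived from a pairwise alignment. The cited reports do not state the exact formula used, so the following is only a sketch of one common convention: identical residues divided by the number of aligned, gap-free columns. The function name `percent_identity` and the two toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment (NOT real neutral protease sequences).
seq_a = "GVAHELTHAV"
seq_b = "GIAHEL-HGV"
print(round(percent_identity(seq_a, seq_b), 1))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which is one reason why percentages from different reports are not always directly comparable.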

Koide et al. compared the amino acid sequences of intracel- 
lular serine proteases from B. subtilis with those of subtilisin 
Carlsberg and subtilisin BPN' and showed that they were 45% 
homologous (138). The sequence around the catalytic triad of 
serine, aspartate, and histidine is highly conserved, suggesting 
that the genes for both the intracellular and extracellular pro- 
teases have evolved from a common ancestor by divergent 
evolution (200). 

The amino acid sequence of an extracellular alkaline pro- 
tease, subtilisin J, is highly homologous to that of subtilisin E 
and shows 69% identity to that of subtilisin Carlsberg, 89% 
identity to that of subtilisin BPN', and 70% identity to that of 
subtilisin DY. The amino acid sequence of subtilisin J is 
identical to that of the protease from B. amylosaccha- 
riticus except for two amino acid substitutions, Thr130 to 
Ser130 and Thr162 to Ser162, in addition to one amino acid 
substitution in the signal peptide and two in the propeptide 
region. The probable active-site residues of subtilisin J, i.e., 
Asp32, His64, and Ser221, are identical to those of other sub- 
tilisins from Bacillus. Therefore, it was concluded that the 
alkaline protease from B. stearothermophilus is a subtilisin. 
Similarly, the various Bacillus serine alkaline proteases, such as 
bacillopeptidase F, subtilisin, Epr, and ISP-1, show consider- 
able homology and conserved amino acids around the active- 
site residues, i.e., Ser, Asp, and His (265). 

The extracellular proteases of B. subtilis are synthesized as 
prepro-enzymes. Four neutral proteases from bacilli with 
known pro-sequences were compared, and considerable ho- 
mology within the pro-peptide region was observed (297). 
Since the pro-peptide region mediates the folding of the pro- 
tease, it would be interesting to learn about the residues es- 
sential for folding and to determine whether the mechanism of 
folding is similar in these proteases. Sequences corresponding 
to the mature form of these enzymes were compared by using 
thermolysin sequence as a reference. The zinc-binding site 
(His142, His146, and Glu166) and the residues participating in 
the catalytic reaction and positioning of the substrate back- 
bone in the active site (Asn112, Ala113, Glu143, Tyr157, and 
His231) were found to be conserved. Differences in these 
might lead to altered substrate specificities. Of the four calci- 
um-binding sites in thermolysin, two sites, i.e., sites 3 and 4, are 
absent in the thermolabile neutral proteases of B. amylolique- 
faciens and B. subtilis (NprA) whereas in NprB, Asn187 in site 
3 is replaced by Arg. Such changes are responsible for the loss 
of thermostability and can be detected by sequence homology 
studies. 

Alkaline proteases from various species of Aspergillus also 
show a high degree of homology (131). Alp from A. oryzae 
shows considerable homology (29 to 44%) to the members of 
the subtilisin family with conserved active-site residues (288). 
However, Alp exhibits little homology to mammalian serine 
proteases such as trypsin and chymotrypsin. The deduced 
structure of the KEX-1 protein, required for the production of 
the killer toxin of Kluyveromyces lactis, contains an internal 
domain with a striking homology to the sequences of subtilisin- 
type proteases (242). Therefore, it was deduced that the prod- 
uct of the KEX-1 gene of K. lactis is a protease involved in the 
processing of the toxin precursor. 

A characteristic of trypsin-related enzymes is the presence 
of disulfide bonds, which are absent in all known subtilisins. 
Proteinase K from Tritirachium album Limber is a single chain 
protein of 277 aa with two disulfide bonds at positions 34-124 
and positions 179-248 and a free -SH group at position 73. 
Sequences around the active-site residues correspond to those 
around the active-site residues of subtilisins. Comparison of 
the proteinase K sequence with known subtilisins shows 35% 
homology and 44% sequence identity to thermitase, which is 
indicative of a relationship between proteinase K and the sub- 
tilisin family. It is likely that these enzymes have evolved from 
a common ancestral precursor serine proteinase (122). How- 
ever, there is a distinct difference between the typical sub- 
tilisins and proteinase K, since the latter has two disulfide 
bonds, which are lacking in subtilisins. Therefore, it has been 
assumed that the two progenitors diverged from an ancestral 
proteinase, separating the subtilisin-related enzymes into two 
subclasses: (i) cysteine-containing subtilisins, e.g., proteinase K 
and thermitase, and (ii) cysteine-free subtilisins, e.g., subtilisin 
Novo, Carlsberg, or DY. 

The proteasome or multicatalytic endopeptidase complex is 
a high-molecular-mass multisubunit complex that is ubiquitous 
in eukaryotes and also found in the archaebacterium Thermo- 
plasma acidophilum (336). While eukaryotic proteasomes con- 
tain 15 to 20 different subunits, the archaebacterial proteasome 
is made of only two different subunits (α and β), yet the 
complexes are almost identical in size and shape. The α (233- 
aa) and β (211-aa) subunits of T. acidophilum have a sequence 
identity of 24% and an overall similarity of 47%, indicating 
that the genes encoding the two subunits arose from a common 
ancestor. All the sequences of proteasomal subunits from eu- 
karyotes available to date can be related to either the α or β 
subunit of the T. acidophilum "urproteasome," and they can be 
distinguished by the presence or absence of a highly conserved 
N-terminal extension which is characteristic of α-type subunits. 
In terms of evolution, the genes for these α and β subunits can 
be considered paralogous (genes resulting from duplication 
and divergence of one gene within one genome) and therefore 
are able to acquire different functions. The α subunit of the T. 
acidophilum proteasome shows sequence similarity to the S. 
cerevisiae wild-type suppressor gene scl1-encoded polypeptide, 
which is probably identical to the subunit YC7-α of the yeast 
proteasome. This lends support to a putative role of protea- 
somes in the regulation of gene expression (337). The amino 
acid sequence of Xenopus proteasome subunit XC3 is highly 
homologous (95.3%) to those of the rat RC3 and human HC3 
subunits (66). The presence of an accessible nuclear targeting 
signal at the C terminus of the subunits suggests that it is 
probably involved in the regulation of the cellular distribution 
of the proteasome. 

The secretable acid protease of the yeast Saccharomycopsis 
fibuligera carries a hydrophobic amino-terminal segment of 
about 20 aa which resembles signal sequences found in a wide 
variety of secretory protein precursors (95). Alignment of this 
sequence with the aspartyl protease family showed significant 
homologies, especially in the regions surrounding the two ac- 
tive-site aspartate residues. These results suggest that the 
PEP1 gene is a structural gene for the secretable acid protease 
of S. fibuligera. The aspartic protease from Rhizopus niveus 
(RNAP) shows 76% homology to rhizopuspepsin, 42% homol- 
ogy to penicillopepsin, and 41% homology to human pepsin 
(100, 101). The homology between RNAP and rhizopuspepsin 
is found throughout their structures. Based on this homology, 
an intron within the coding region and a prepro-enzyme se- 
quence of 66 aa upstream of the mature enzyme were detected 
in RNAP. Studies of the homology of proteases have shown 
that the residues involved in the substrate and metal ion bind- 
ing, catalysis, disulfide bond formation and active-site for- 
mation are conserved. Analysis of sequence homology is 
used in deciphering the structure-function relationship of 
proteases. 



EVOLUTIONARY RELATIONSHIP OF PROTEASES 

Proteases are present in all living organisms and are consid- 
ered to have arisen in the earliest phases of biological evolu- 
tion, some 1 billion years ago. Comparisons of amino acid 
sequences, three-dimensional structures, and mechanism of 
action of proteases assist in deciphering of their course of 
evolution. Changes in molecular structure have accompanied 
the demands for altered functions of proteases during evolu- 
tion. We have compiled the amino acid sequences of proteases 
from diverse origins such as microbes, plants, and animals and 
have arranged them in three different groups based on the pH 
of their action. These sequences, which have been selected 
from SWISS-PROT and PIR entries, are of comparable length 
and have been aligned with CLUSTAL W software for multi- 
ple alignments (291) (Fig. 4). 

Acidic Proteases 

The proteases selected here for comparison of amino acid 
sequences are active between pH 2 and 6. They include mostly 
aspartic proteases and also some of the cysteine proteases and 
metalloproteases. They are about 380 to 420 aa long and have 
different amino acid residues constituting the active site, as 
shown in Table 6. The homology between these acidic pro- 
teases is shown in Fig. 4A. The sequences belonging to pepsin 
family (A1) are grouped and are aligned below the other se- 
quences. As expected, there is considerable homology among 
these five acidic proteases. The sequences around the two 
aspartic residues (D97 and D258, residues numbered accord- 
ing to the Bajra protease) constituting the active site are con- 
served. Among these five proteases, the rat and monkey pro- 
teases show maximum homology (68.4%) and are related to 
the mosquito lysosomal aspartic protease. When four monkey 
pepsinogens which show development-dependent expression 
were compared, a very high homology was observed (126). 
Pepsinogens A-1 and A-2/3 differed in seven amino acids and 
only in five amino acids when the pepsin moiety alone was 
examined. The mosquito lysosomal protease is very closely 
related to human cathepsin D, exhibiting 92% homology 
(37). 

The amino acid sequences of the acid proteases of C. tropicalis 
and Saccharomycopsis fibuligera show considerable homology (42.6%). High 
similarity scores were obtained when the acid protease from C. 
tropicalis was compared with Rhizopus aspartic proteases, hu- 
man pepsinogen A precursor, protease A from yeast, the bar- 
rier protein from S. cerevisiae, and an acidic protease from S. 
fibuligera (294). 

The cysteine protease from Hordeum vulgare shows some 
homology to the snake venom metalloprotease from Crota- 
lus atrox, which is not statistically significant, whereas the 
Gpr protease from Bacillus megaterium, which plays a vital 
role in spore germination, shows least homology to all other 
acidic proteases but shares one of the active-site aspartate 
residues (D258) with them. The Gpr acidic proteases of B. 
subtilis and B. megaterium showed 68% identity in their 
sequences, but comparison of the B. subtilis Gpr amino acid 
sequence with that of its serine protease or metalloprotease 
revealed no significant homologies (274), which supports 
our observations. This suggests that the genes encoding 
these proteases do not have a common ancestor or that if 
they do so, they have undergone much divergence. The lack 
of homology between the spore protease and other B. subtilis 
proteases can be explained by differences in their prop- 
erties such as the number of subunits and sequence speci- 
ficity for the substrate. Thus, our results, in agreement with 
previous reports, indicate that the extent of homology is 
greater if the proteases belong to the same family and that 
in the same family the homology is greater if the phyloge- 
netic distance is shorter. 

A pairwise computer comparison also provides more infor- 
mation about the evolutionary relationships between the mem- 
bers of the different families. The dendrograms generated by 
this analysis, using the TreeView package (213), demonstrate 
the relationship among the proteins based on the similarity of 
the amino acid sequences (Fig. 5a). 
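
How a matrix of pairwise similarities becomes a dendrogram can be illustrated with a small clustering sketch. The text names only CLUSTAL W for alignment and TreeView for display and does not state the tree-building method, so UPGMA (average linkage) is swapped in here purely because it is compact; it is not claimed to be the method behind Fig. 5. The function name `upgma`, the labels A to D, and the distances (taken as 100 minus a percent identity) are all hypothetical.

```python
from itertools import combinations


def upgma(names, distance):
    """Average-linkage clustering; returns a Newick-style string without branch lengths.

    `distance` maps a pair of labels, e.g. ("A", "B"), to a distance.
    """
    size = {name: 1 for name in names}  # cluster label -> number of leaves
    d = {tuple(sorted(pair)): value for pair, value in distance.items()}
    while len(size) > 1:
        # Pick the closest pair among the current clusters.
        a, b = min(combinations(sorted(size), 2), key=lambda pair: d[pair])
        merged = f"({a},{b})"
        for other in size:
            if other in (a, b):
                continue
            # Size-weighted mean of the distances to the two merged clusters.
            da = d[tuple(sorted((a, other)))]
            db = d[tuple(sorted((b, other)))]
            new_distance = (size[a] * da + size[b] * db) / (size[a] + size[b])
            d[tuple(sorted((merged, other)))] = new_distance
        size[merged] = size.pop(a) + size.pop(b)
    return next(iter(size)) + ";"


# Hypothetical distances (100 - percent identity); NOT data from Fig. 4 or 5.
toy_names = ["A", "B", "C", "D"]
toy_distance = {
    ("A", "B"): 15.0, ("C", "D"): 18.0,
    ("A", "C"): 70.0, ("A", "D"): 68.0,
    ("B", "C"): 69.0, ("B", "D"): 71.0,
}
print(upgma(toy_names, toy_distance))
```

With these toy values the two close pairs are joined first and the two resulting clusters last, which mirrors the general observation that homology is greatest within a family.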

Neutral Proteases 

The neutral proteases, which are active at neutral or weakly 
alkaline or weakly acidic pH, include cysteine proteases, met- 
alloproteases, and some of the serine proteases. Brenner (25) 
has pointed out that the two codon families for serine, TCN and AGY, 
cannot be interconverted by single-nucleotide mutations but 
can be connected through two other codon families, ACN for threonine and 
TGY for cysteine. Thus, there can be at least two different lines 
of descent for the active-site sequences of the serine proteases. 
The simplest pathway for this convergent evolution is by the 
divergence of each line from a precursor which was itself cat- 
alytically active and had much the same sequence. It was fur- 
ther demonstrated that modern serine enzymes are likely to 
have arisen from cysteine precursors. These findings encour- 
age the search for evidence to connect the presumed and 
existing cysteine sequences with their postulated metalloen- 
zyme predecessors. For this search and construction of phy- 
logenetic trees, gene structure is important. Thus, multiple 
lines of descent can be realistically considered in situations 
with sequence similarity but with differences in gene struc- 
ture. 
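
Brenner's observation about the serine codons can be checked mechanically. The sketch below is an illustration and not part of the cited analysis: it builds the standard genetic code, tests whether any single-nucleotide substitution converts a TCN codon into an AGY codon, and lists the amino acids that have a codon one substitution away from both families. All function and variable names are ours.

```python
from itertools import product

BASES = "TCAG"

# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def neighbors(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for i, base in enumerate(codon):
        for b in BASES:
            if b != base:
                yield codon[:i] + b + codon[i + 1:]


def directly_connected(group_a, group_b):
    """True if one substitution turns some codon of group_a into one of group_b."""
    return any(n in group_b for c in group_a for n in neighbors(c))


def bridging_amino_acids(group_a, group_b):
    """Amino acids with a codon one substitution away from both groups."""
    bridges = set()
    for codon, aa in CODON_TABLE.items():
        if aa in ("S", "*"):
            continue  # skip serine itself and stop codons
        adjacent = set(neighbors(codon))
        if adjacent & group_a and adjacent & group_b:
            bridges.add(aa)
    return bridges


SERINE = {c for c, aa in CODON_TABLE.items() if aa == "S"}
TCN = {c for c in SERINE if c.startswith("TC")}
AGY = SERINE - TCN  # AGT and AGC

print(directly_connected(TCN, AGY))
print(sorted(bridging_amino_acids(TCN, AGY)))
```

Under the standard code, the check finds no direct single-step route and returns threonine (T) and cysteine (C) as the only bridging amino acids. Of the four ACN codons, only ACT and ACC are adjacent to both serine families.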

The neutral proteases selected for sequence analysis in the 
present study are in the size range of 225 to 275 aa (Table 6). 
The homology between them is shown in Fig. 4B. Of 14 pro- 
teases, 9 belong to the T1A or proteasome A family of the 
multicatalytic endopeptidase complex. The sequences of the 
proteasomal subunits aligned here can be related to the α 
subunit of the Thermoplasma proteasome and show consider- 
able homology. It is still not clear to which family of proteases 
the proteasomes belong (93). As in the cysteine family of 
proteases, all nine proteasome subunits show a conserved pro- 
line residue (P-17), which may serve to prevent unwanted N- 
terminal proteolysis (12). Many residues at the N terminus are 
highly conserved, which is a characteristic of the α-type sub- 
units. The similarity decreases toward the C terminus, which 
appears to be rather variable (337). Although the β subunit 
shows no sequence motif characteristic of serine proteases, it 
contains all the essential amino acid residues forming the cat- 
alytic triad or the "charge relay system" (Ser, Asp, and His). 
These residues are found to be conserved (Ser16, His73, and 
Asp84), except for the histidine in the α subunits of Thermo- 
plasma, yeast (S. cerevisiae), and Caenorhabditis elegans (resi- 
dues are numbered according to the Thermoplasma α-subunit 
sequence). Therefore, it is possible that the active site is shared 
between the α and β subunits (336). The tyrosine autophos- 
phorylation site at Tyr123 is conserved in six of the nine se- 
quences. The cAMP/cGMP-dependent phosphorylation sites 
between aa 31 and 37 are found only in Thermoplasma and 
Drosophila spp. (84), as reported by Zwickl et al. (337). A 
consensus nuclear localization signal between aa 49 and 56 
(240) and a region complementary to the nuclear localization 
signal consensus sequence (326) between aa 201 and 212 can 
be identified in these sequences. Thus, the sequence compar- 
ison of various a proteasonie subunits from archaebacteria to 
mammals shows high homology. 

The bovine and porcine proteases which belong to the calpain
or C2 family of cysteine proteases differ from each other
in only six amino acid residues and thus show almost 99%
homology, with highly conserved calcium-binding domains and
the N-terminal glycine-rich hydrophobic region. The region
rich in proline residues (aa 76 to 81, numbered as in the
Thermoplasma protease) is also conserved except at position
79, where proline is replaced by valine.

Tryptase precursors from humans and dogs (301), which
belong to the S1 or trypsin family of serine proteases, show
76% sequence identity. The signal sequence from residues 1 to
30 is 60% identical; the difference is only in the site of glycosylation,
which is Asn132 in the canine sequence and Asn233 in
the human sequence. The sequences for active-site and disulfide
bond formation are highly conserved and correspond to
those of chymotrypsinogen (302).

The relationship between these neutral proteases is evident
from the dendrogram shown in Fig. 5b.
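
Identity figures such as the 76% quoted for the tryptase precursors are derived from a pairwise alignment. The Python sketch below shows one common way such a value can be computed. It assumes two already-aligned strings of equal length with "-" marking gaps, and it counts only columns in which both sequences carry a residue. The function name and the gap convention are illustrative assumptions; they are not necessarily the convention applied to the CLUSTAL W alignments in this study.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the shorter sequence length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0   # columns with a residue in both sequences
    identical = 0  # columns where those residues match
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

With this convention, two toy strings such as "ACDE-G" and "ACDQFG" share four identical residues over five compared columns.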

Alkaline Proteases 

The alkaline proteases selected here are active in the pH
range of 8 to 13 and are about 420 to 480 residues in length. Six
of them belong to the S8 or subtilase family of serine proteases
(Table 6). They are aligned in their phylogenetic order, as
shown in Fig. 4C. Considerable homology within the same
genus is observed for Bacillus and Aspergillus proteases and
three other fungal proteases. However, these proteases show
comparatively lower homology among themselves. The active-site
residues, as well as the residues around the active site, are
highly conserved, suggesting that they may have evolved from
a common ancestor. The sequences of E. coli and Cyprinus
carpio seem to be homologous to some extent, but they do not
have common active-site residues and therefore do not have a
common ancestor. These two, in turn, show no significant homology
to the other seven alkaline proteases. The overall homology
among all these sequences can be represented by the
dendrogram in Fig. 5c.

The results of our analysis of the amino acid sequences of 
the acidic, neutral, and alkaline proteases indicate that the 
members of the pepsin family of acidic proteases may have 
evolved from a common ancestor by convergent evolution. 
High homology between the sequences of the a subunits of 
proteasomes provides evidence for the presence of an evolu- 
tionarily conserved gene family. No amino acid residues are 
conserved in all the acidic or neutral proteases, except glycine. 
The alkaline serine proteases seem to have evolved from a
common ancestor by divergent evolution. In general, the sequences
belonging to the same family show more homology or
are more closely related. This criterion is currently used to 
assign a particular sequence to a particular family, i.e., the 
serine protease, cysteine protease, aspartate protease, or met- 
alloprotease family. Within a family, the extent of homology is 
inversely proportional to the phylogenetic distance. The proteases
from distantly related organisms show less homology or
more diversity. However, this needs extensive sequence analysis
of proteases, since the homology depends on many parameters
or factors such as structure, function, source, and nature
of the catalytic or active site. Thus, proteases are highly diverse
enzymes having different active sites and metal-binding regions.
The residues involved in disulfide bond formation and
their positions, which vary in different proteases, can be detected
by multiple alignments.

FIG. 4. Homology alignment of the protease sequences. The protease sequences have been selected from the SWISS-PROT and PIR entries, and some have been
deduced from the nucleotide sequences obtained from the EMBL database. These are aligned by using CLUSTAL W software for multiple alignment (291). (A) Acidic
proteases. (B) Neutral proteases. (C) Alkaline proteases. The key to the sequences is given in Table 5. Numbering of the amino acid residues is based on the first
sequence in the list. Identical (O) and conserved (•) residues are boxed; those involved in the active site are indicated by *.



CURRENT PROBLEMS AND POTENTIAL SOLUTIONS 

Proteases are a complex group of enzymes which differ in 
their properties such as substrate specificity, active site, and 
catalytic mechanism. Their exquisite specificities provide a basis
for their numerous physiological and commercial applications.
Despite the extensive research on several aspects of
proteases from ancient times, there are several gaps in our 
knowledge of these enzymes and there is tremendous scope for 
improving their properties to suit projected applications. The 
future lines of development would include (i) genetic ap- 
proaches to generate microbial strains for hyperproduction of 
the enzymes, (ii) application of SDM to design proteases with 
unique specificity and increased resistance to heat and alkaline 
pH, (iii) synthesis of peptides (synzymes) to mimic proteases, 
(iv) production of abzymes (catalytic antibodies) with specific 
protease activity, and (v) understanding of the structure-func- 
tion relationship of the enzymes. Although the section on pro- 
tein engineering describes in detail how SDM has been used to 
alter the properties and functions of proteases of bacterial, 
fungal, and viral origins, some of the important problems faced 
in their desired usages and the possible solutions to overcome 
these hurdles are discussed below. 

Enhancement of Thermostability 

The industrial use of proteases in detergents or for leather 
processing requires that the enzyme be stable at higher temperatures.
One of the common strategies to enhance the thermostability
of the enzyme is to introduce disulfide bonds into
the protease by SDM. Introduction of a disulfide bond into
subtilisin E from Bacillus subtilis resulted in an increase of
4.5°C in the Tm of the mutant enzyme without causing any
change in its catalytic efficiency (280). However, the properties
of the mutant enzyme were found to revert to those of the
wild-type enzyme. Enhanced stability of subtilisin was observed
as a result of mutations of Asn109 and Asn218 to Ser. The
analog containing both the mutations showed an additive effect
on thermal stability. Thermostability of the alkaline protease
from Aspergillus oryzae is important because of its extensive use
in the manufacture of soy sauce. The optimal temperature of
the wild-type enzyme was enhanced from 51 to 56°C by the
introduction of a disulfide (Cys 169-Cys 200) bond. Another
strategy for improving the stability of the protease was by 
replacing the polar amino acid groups by hydrophobic groups. 
The presence of positively charged amino acids in the N-terminal
turn of an α-helix is undesirable because of possible
repulsive interactions with the helix
dipole. Replacement of Lys by Ser or Asp resulted in an in- 
crease in the thermostability of the neutral protease from B. 
subtilis in the range of 0.3 to 1.2°C (54). Although these ap- 
proaches result in an increased stability of proteases, the dif- 
ference in the thermostabilities of the parent and the mutant 
enzymes is only marginal, and further research involving cas- 
sette mutagenesis, etc., is necessary to yield an enzyme with 
substantially enhanced thermostability.

TABLE 6. Proteases selected for multiple alignment^a

SWISS-PROT/PIR   No. of amino    Family of protease           Residue(s) at
entry            acid residues                                active site

Acidic proteases
CYS2_HORVU       373             C1/papain (cysteine)         C158, H297, N318
HRTD_CROAT       414             M12B (metallo)               E3n
CPR_BACME        371             (aspartic)                   D89, D258
ASPP_AEDAE       395                                          D96, D258
CARP_CANTR       390             Candidapepsin (aspartic)     D96, D258
CARP_SACFI       390             Saccharopepsin (aspartic)    D96, D258
PEPC_RAT         382             Gastricsin (aspartic)        D96, D258
PEP2_MACFU       378             Pepsin A (aspartic)          D96, D258

Neutral proteases
PRCA_THEAC       233             PS                           U
PRC3_YEAST       250             PS                           U
PRC6_SCHPO       259             PS                           U
PRC2_ORYSA       270             PS                           U
PRC6_ARATH       250             PS                           U
PRC6_DICDI       250             PS                           U
PRC8_CAEEL       259             PS                           U
PRC6_DROME       249             PS                           U
PRC3_XENLA       233             PS                           U
CANS_BOVIN       263             C2/calpain (cysteine)        U
CANS_PIG         266             C2/calpain (cysteine)        U
TRYT_CANFA       275             S1/trypsin (serine)          H74, D121, S191
TRYB_HUMAN       275             S1/trypsin (serine)          H74, D121, S191
SNPA_STRLI       237             M7 (metallo)                 E64

Alkaline proteases
JC6052           355             Trypsin-like protease        H91, D126, S201
ELYA_BACAO       380             S8/subtilase                 D120, H150, S302
SUBT_BACST       381             S8/subtilase                 D120, H150, S302
PRTK_TRIAL       384             S8/subtilase                 D120, H150, S302
ALP_TRIHA        409             S8/subtilase                 D120, H150, S302
ALP_CEPAC        402             S8/subtilase                 D120, H150, S302
ORYZ_ASPFL       403             S8/subtilase                 D120, H150, S302
ORYZ_ASPFU       403             S8/subtilase                 D120, H150, S302
150494           410             Serine protease inhibitor    U

^a Key to the entry names of acidic proteases: CYS2_HORVU, Hordeum vulgare; HRTD_CROAT, Crotalus atrox; CPR_BACME, Bacillus megaterium;
ASPP_AEDAE, Aedes aegypti; CARP_CANTR, Candida tropicalis; CARP_SACFI, Saccharomycopsis fibuligera; PEPC_RAT, Rattus norvegicus; PEP2_MACFU, Macaca
fuscata. Sequences are numbered according to the Hordeum vulgare cysteine protease. Key to the neutral protease sequences: PRCA_THEAC, Thermoplasma
acidophilum; PRC3_YEAST, Saccharomycopsis fibuligera; PRC6_SCHPO, Schizosaccharomyces pombe; PRC2_ORYSA, Oryza sativa; PRC6_ARATH, Arabidopsis
thaliana; PRC6_DICDI, Dictyostelium discoideum; PRC8_CAEEL, Caenorhabditis elegans; PRC6_DROME, Drosophila melanogaster; PRC3_XENLA, Xenopus laevis;
CANS_BOVIN, Bos taurus; CANS_PIG, Sus scrofa; TRYT_CANFA, Canis familiaris; TRYB_HUMAN, Homo sapiens; SNPA_STRLI, Streptomyces lividans.
Sequences are numbered according to the Thermoplasma protease. PS, proteasome subunit; U, unknown. Key to the alkaline protease sequences: JC6052, Escherichia
coli; ELYA_BACAO, Bacillus amyloliquefaciens; SUBT_BACST, Bacillus subtilis; PRTK_TRIAL, Tritirachium album Limber; ALP_TRIHA, Tritirachium harzianum;
ALP_CEPAC, Cephalosporium acremonium; ORYZ_ASPFL, Aspergillus flavus; ORYZ_ASPFU, Aspergillus fumigatus; 150494, Cyprinus carpio. Residues are numbered
according to the E. coli protease.



Prevention of Autoproteolytic Inactivation 

Subtilisin, an extensively studied protease, is widely used in 
detergent formulations due to its stability at alkaline pH. How- 
ever, its autolytic digestion presents a major problem for its use 
in industry. It was deduced that there is a correlation between 
the autolytic and conformational stabilities. Computer model- 
ling followed by introduction of a Cys24 or Cys87 mutation 
resulted in destabilization of subtilisin from Bacillus amyloliq- 
uefaciens (312). Introduction of a disulfide bond increased the 
stability of the mutant to a level less than or equal to that of the 
wild-type enzyme. It appears logical that mutations in the 
amino acids involved at the site of autoproteolysis may prevent 
the protease inactivation caused by self-digestion. 

Alteration of pH Optimum 

Different applications of proteases require specific optimal 
pHs for the best performance of the enzyme; e.g., the use of 
proteases in the leather and detergent industries requires an
enzyme with an alkaline pH optimum, whereas the use in the
cheesemaking industry requires an acidic protease. Protein
engineering enables us to tailor the pH dependence of the
enzyme catalysis to optimize the industrial processes. Modifications
in the overall surface charge of the proteins are known
to alter the optimal pH of the enzyme. A change of Asp99 to
Ser in subtilisin from Bacillus amyloliquefaciens has demonstrated
the potential of altering the optimal pH of the enzyme
by systematic multiple mutations on the surface of the protein
(290).

Changing of Substrate Specificity 

The properties needed for industrial applications of pro- 
teases differ from their physiological properties. The natural 
substrates of the enzyme are usually different from those desired
for their industrial applications. Despite extensive research
on proteases, relatively little is known about the factors
that control their specificities toward nonphysiological substrates.
Strategies involving SDM are being explored to tailor
these specificities at will. A combinatorial random-mutagenesis
approach has been used to generate mutants that secrete proteases
with functional properties different from those of the
parent enzyme (77). Introduction of several point mutations
into the substrate-binding site of α-lytic protease was shown to
affect its specificity in a predictable manner. The protease
preferentially cleaves on the C-terminal side of small uncharged
residues such as Ala, mainly because the pocket that
accommodates the substrate P-1 residue is shallow due to the
presence of two bulky methionine residues (Met190 and
Met213) at the subsite. Replacement of Met213 with a His
residue had a beneficial effect on its substrate specificity.

FIG. 5. Dendrogram showing the relationships among the proteases, created by the Tree View package (213). The proteases are grouped as acidic proteases (a),
neutral proteases (b), and alkaline proteases (c). Abbreviations of the species described are those used in Table 6. The differences between the sequences are
proportional to the length along the horizontal axis.

Improvement of Yield 

The cost of enzyme production is a major obstacle in the 
successful application of proteases in industry. Protease yields 
have been improved by screening for hyperproducing strains 
and/or by optimization of the fermentation medium. Strain
improvement by either conventional mutagenesis or recombinant-DNA
technology has been useful in improving the production
of proteases. Hyperexpression by genetic manipulation
of microbes is described in the section on genetic engineering. 
Increases in the yield of viral proteases are particularly impor- 
tant for developing therapeutic agents against devastating dis- 
eases such as malaria, cancer, and AIDS. 

There are many major problems in the commercialization of 
proteases. Although they are being addressed by both conven- 
tional and novel methods of genetic manipulation, there are no 
entirely satisfactory solutions and many of these problems re- 
main unanswered. 

FUTURE SCOPE 

Proteases are a unique class of enzymes, since they are of 
immense physiological as well as commercial importance. They 
possess both degradative and synthetic properties. Since pro- 
teases are physiologically necessary, they occur ubiquitously in 
animals, plants, and microbes. However, microbes are a goldmine
of proteases and represent the preferred source of enzymes
in view of their rapid growth, limited space required for
cultivation, and ready accessibility to genetic manipulation.
Microbial proteases have been extensively used in the food, 
dairy and detergent industries since ancient times. There is a 
renewed interest in proteases as targets for developing thera- 
peutic agents against relentlessly spreading fatal diseases such 
as cancer, malaria, and AIDS. Advances in genetic manipulation
of microorganisms by SDM of the cloned gene open new
possibilities for the introduction of predesigned changes, re- 
sulting in the production of tailor-made proteases with novel 
and desirable properties. The development of recombinant 
rennin and its commercialization by Pfizer and Genencor is an 
excellent example of the successful application of modern bi- 
ology to biotechnology. The advent of techniques for rapid 
sequencing of cloned DNA has yielded an explosive increase in 
protease sequence information. Analysis of sequences for 
acidic, alkaline, and neutral proteases has provided new in- 
sights into the evolutionary relationships of proteases. 

Despite the systematic application of recombinant technol- 
ogy and protein engineering to alter the properties of pro- 
teases, it has not been possible to obtain microbial proteases 
that are ideal for their biotechnological applications. Industrial 
applications of proteases have posed several problems and 
challenges for their further improvements. The biodiversity 
represents an invaluable resource for biotechnological innova- 
tions and plays an important role in the search for improved 
strains of microorganisms used in the industry. A recent trend 
has involved conducting industrial reactions with enzymes 
reaped from exotic microorganisms that inhabit hot waters, 
freezing Arctic waters, saline waters, or extremely acidic or 
alkaline habitats. The proteases isolated from extremophilic 
organisms are likely to mimic some of the unnatural properties 
of the enzymes that are desirable for their commercial appli- 
cations. Exploitation of biodiversity to provide microorganisms 
that produce proteases well suited for their diverse applica- 
tions is considered to be one of the most promising future 
alternatives. Introduction of extremophilic proteases into in- 
dustrial processes is hampered by the difficulties encountered 
in growing the extremophiles as laboratory cultures. Revolutionary
robotic approaches such as DNA shuffling are being
developed to rationalize the use of enzymes from extremophiles.
The existing knowledge about the structure-function
relationship of proteases, coupled with gene-shuffling techniques,
promises a fair chance of success, in the near future, in
evolving proteases that were never made in nature and that 
would meet the requirements of the multitude of protease 
applications. 

A century after the pioneering work of Louis Pasteur, the 
science of microbiology has reached its pinnacle. In a relatively 
short time, modern biotechnology has grown dramatically from 
a laboratory curiosity to a commercial activity. Advances in 
microbiology and biotechnology have created a favorable niche 
for the development of proteases and will continue to facilitate 
their applications to provide a sustainable environment for 
mankind and to improve the quality of human life.

Abstract

Two cathepsins were identified in the white skeletal muscle from Carassius
auratus gibelio. One of these, cathepsin D, has optimal activity at pH 3.5 with
hemoglobin as substrate and a molecular weight of 38,200 Da. The second, cathepsin E,
is a protein with a molecular weight of 82,000 Da and an optimum pH of 2.5. Both of the
enzymes were strongly inhibited by pepstatin, a specific inhibitor of aspartic
proteinases.

Keywords: cathepsin D, cathepsin E, Carassius auratus gibelio, pepstatin.

Introduction

Proteolytic cleavage of peptide bonds is one of the most frequent and important
enzymatic modifications of proteins. Proteolytic processing is the final step in the expression
of activity of a great variety of proteins. Processing occurs in many different ways and is 
triggered by many different kinds of proteases. However, in every known case, proteolysis is 
directed and limited to the cleavage of specific peptide bonds in the target protein. The key to 
this selectivity is limited proteolysis, which depends on the accessibility of the scissile peptide 
bond to the processing protease and on its specificity. Compact protein domains are usually 
resistant to proteolysis, in contrast to more flexible surface loops and interdomain regions that 
can adapt themselves to the active site of the protease [1]. 

Proteolytic enzymes are ubiquitously distributed in all biological tissues and fluids. 
The best characterized are mammalian digestive proteases such as pepsin, trypsin, 
chymotrypsin and elastase. The digestive proteases are involved in the hydrolysis of dietary 
proteins and do not play a role in protein turnover within an organism. Much less is known, 
by comparison, about intracellular tissue proteases, their enzymatic specificity and 
physiological substrates [2]. The repertoire of proteases that are integral components of cells 
is enormous and in part unexplored. To mention a few, it includes the entire class of 
lysosomal proteases (cathepsins), membrane-bound proteases and proteases of specialized 
tissues such as the reproductive tracts, muscle, skin, lens and adrenals. 

The purpose of the present study was to investigate the proteolytic enzymes from white
skeletal muscle of Carassius auratus gibelio. Many kinds of proteolytic enzymes have been
reported in fish. A human-like cathepsin D protease was isolated from the salmon
(Oncorhynchus masou) ovary [3]. A caspase, which plays an important role in apoptosis in
fish cells and zebrafish (Danio rerio) embryo, was reported [4]. An anionic trypsin acting on
p-aminophenyl ester was isolated from chum salmon (Oncorhynchus keta) [5]. A
metalloproteinase capable of degrading type I collagen was detected in rainbow trout matrix
[6].

Materials and Methods 

Enzyme extraction

After the fish were killed, the white skeletal muscle was quickly dissected and rinsed with
cold saline solution to remove blood. The tissue was suspended in demineralised water and
disrupted with a Polytron homogenizer at 0-4°C. The disrupted cells were subjected to
subcellular fractionation by differential centrifugation. In this procedure the homogenate was
centrifuged at 600 g for 10 minutes, then the supernatant obtained was centrifuged for 10
minutes at 10,000 g. The pellets were suspended in 0.1 M TRIS-HCl buffer, pH 7.4, and
centrifuged at 4,000 g. The clear supernatant obtained contains mainly proteins from
mitochondria and lysosomes and was used for investigation.

Assay of proteolytic activity 

The proteolytic activity (cathepsin activity) was assayed according to Barrett [7], using
acid-denatured hemoglobin as a substrate. An 8% hemoglobin solution was diluted about 3-fold
with distilled water, then acidified to pH 2.0 with 1 N HCl, and the final concentration of
hemoglobin was brought to 2% with distilled water. One milliliter of the substrate solution was
incubated with an appropriate amount of enzyme solution at 37°C. The reaction was stopped by
addition of 2 ml of 5% trichloroacetic acid. The mixture was filtered and the soluble reaction
products were treated with Folin reagent and measured at 750 nm.

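
The substrate preparation above takes an 8% hemoglobin stock to a 2% working solution, which is a four-fold dilution overall. The Python sketch below works through that arithmetic with the usual C1·V1 = C2·V2 relation. The function names and the 10 ml starting volume in the usage note are illustrative assumptions; the paper does not state the volumes used.

```python
def final_volume_ml(stock_percent: float, stock_volume_ml: float,
                    target_percent: float) -> float:
    """Final volume V2 from C1 * V1 = C2 * V2 (concentrations in % w/v)."""
    if stock_percent <= 0 or target_percent <= 0 or stock_volume_ml <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_percent > stock_percent:
        raise ValueError("dilution cannot raise the concentration")
    return stock_percent * stock_volume_ml / target_percent


def diluent_to_add_ml(stock_percent: float, stock_volume_ml: float,
                      target_percent: float) -> float:
    """Total liquid (water plus acid) to add to reach the target concentration."""
    return final_volume_ml(stock_percent, stock_volume_ml,
                           target_percent) - stock_volume_ml
```

For a hypothetical 10 ml of 8% stock, `final_volume_ml(8.0, 10.0, 2.0)` gives the total volume at 2%, and `diluent_to_add_ml` gives the combined volume of water and HCl to be added.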
Units of activity

There is no international agreement about units obtained by the hemoglobin-digestion
method. Some investigators, including Anson and Mirsky [8], used the amount of tyrosine
equivalents in trichloroacetic acid-soluble peptides. In the current study, one unit of proteolytic
activity (U) was defined as the amount of enzyme that releases 1 µmol of tyrosine per minute
from the acid-denatured hemoglobin.

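
The unit definition above can be written as a short calculation. The Python sketch below assumes the amount of tyrosine released (in µmol) has already been read from a Folin standard curve, which is not shown here. The function names, and the specific-activity helper that divides by milligrams of protein, are illustrative additions rather than part of the described method.

```python
def proteolytic_units(tyrosine_released_umol: float,
                      incubation_min: float) -> float:
    """Activity in U, where 1 U releases 1 µmol of tyrosine per minute."""
    if incubation_min <= 0:
        raise ValueError("incubation time must be positive")
    if tyrosine_released_umol < 0:
        raise ValueError("tyrosine released cannot be negative")
    return tyrosine_released_umol / incubation_min


def specific_activity(units: float, protein_mg: float) -> float:
    """Specific activity in U per mg of protein (hypothetical helper)."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return units / protein_mg
```

For instance, an assay that released 3 µmol of tyrosine in 30 minutes would correspond to 0.1 U under this definition.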
Determination of protein 

Protein concentration was determined using the Bradford method [9], with bovine 
serum albumin as standard, and, during the course of enzyme purification, by measurement of 
A280. 

Testing of potential inhibitors 

Portions of the enzyme solution were mixed with potential inhibitors and incubated for
30 minutes at 37°C. Mixtures were then incubated with acid-denatured hemoglobin. Reagent
blanks were also run for each potential inhibitor. The percentage inhibition was determined
by comparing the activities with those measured for positive controls that contained no
inhibitor.
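
The comparison with the uninhibited control can be expressed as a simple formula. The Python sketch below assumes that the reagent-blank reading is subtracted from both the inhibited sample and the control before the ratio is taken. That subtraction step and the function name are assumptions made for illustration; the text states only that blanks were run.

```python
def percent_inhibition(control_activity: float, inhibited_activity: float,
                       blank: float = 0.0) -> float:
    """Percentage inhibition relative to a control without inhibitor.

    Assumption: the same reagent-blank value is subtracted from both readings.
    """
    control = control_activity - blank
    inhibited = inhibited_activity - blank
    if control <= 0:
        raise ValueError("blank-corrected control activity must be positive")
    return 100.0 * (1.0 - inhibited / control)
```

A sample retaining one quarter of the control activity would be reported as 75% inhibited.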

Results and Discussions 

The crude extract of white muscle of Carassius auratus gibelio was fractionated with
acetone. The 40-70% acetone precipitate was collected by centrifugation at 15,000 g for 15
minutes at -5°C. This was redissolved in 9% NaCl and applied to a Bio-Gel P-100 column
(1.6 x 72 cm) pre-equilibrated with the same solution. The column was run at 10 ml/h with
9% NaCl. The typical elution profile is shown in Figure 1. Proteolytic activity
against hemoglobin was detected as two peaks. We designated these two peaks as cathepsin I
and cathepsin II because the cathepsins are the only proteolytic enzymes found in
lysosomes.




[Figure 1 plot. x axis: elution volume (ml)]

Figure 1. Separation of cathepsins by gel filtration on Bio-Gel P-100.



The molecular weights of the two proteolytic enzymes were estimated using a
calibration kit with: aprotinin (6,500), cytochrome c (12,400), carbonic anhydrase (29,000),
albumin (66,000) and alcohol dehydrogenase (150,000). The elution volume of cathepsin I
matched a molecular weight of 82,000 Da. For cathepsin II, the molecular weight was
estimated at 38,200 Da (Figure 2).




Figure 2. Molecular weights of cathepsins.
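
Molecular weight estimation by gel filtration usually rests on a linear relation between log10(molecular weight) and elution volume for the standards. The Python sketch below fits that line by ordinary least squares and reads an unknown from it. The standard masses are those listed in the text. The elution volumes are not reported there, so none are supplied, and the function names and the choice of a straight-line fit are assumptions rather than the authors' stated procedure.

```python
import math

# Standard masses (Da) as listed in the text; elution volumes must be measured.
STANDARDS_DA = {
    "aprotinin": 6_500,
    "cytochrome c": 12_400,
    "carbonic anhydrase": 29_000,
    "albumin": 66_000,
    "alcohol dehydrogenase": 150_000,
}


def fit_calibration(elution_volumes_ml, molecular_weights_da):
    """Least-squares fit of log10(MW) = intercept + slope * elution volume."""
    xs = list(elution_volumes_ml)
    ys = [math.log10(mw) for mw in molecular_weights_da]
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two matched standards")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("elution volumes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mw(elution_volume_ml, slope, intercept):
    """Molecular weight (Da) read from the fitted calibration line."""
    return 10 ** (intercept + slope * elution_volume_ml)
```

In use, the measured elution volumes of the five standards would be passed to `fit_calibration` together with `STANDARDS_DA.values()`, and the elution volume of each cathepsin peak to `estimate_mw`.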

The activities of both proteolytic enzymes were higher at acidic pH. Cathepsin I
hydrolyzed hemoglobin most rapidly at around pH 2.5 and at a significantly lower rate at higher
pH values such as pH 4.0 and pH 5.0. For cathepsin II the maximal proteolytic activity was
at pH 3.5 (Figure 3).

Figure 3. Hydrolytic activities of the cathepsins at different pH values.

Both proteolytic enzymes had hydrolytic activity on various protein substrates. Protein
substrates such as hemoglobin, serum albumin, and casein were hydrolyzed efficiently (Table
1). Hemoglobin was the best protein substrate for both enzymes.



Table 1. Hydrolysis of protein substrates by cathepsins I and II.

Substrate               Relative activity* (%)
                        Cathepsin I    Cathepsin II
Hemoglobin              100            100
Bovine serum albumin    22.3           52.3
Ovalbumin               0              2.5
γ-globulin              5.2            7.4
Casein                  30.2           23.6

* Activity with hemoglobin as substrate is taken as 100% in each case.



Potential inhibitors of the proteolytic activities were tested and the results are presented
in Table 2. Iodoacetic acid and iodoacetamide, compounds that irreversibly inactivate
cysteine proteinases, did not significantly alter the activity of either cathepsin from white
skeletal muscle of Carassius auratus gibelio. Diisopropyl phosphofluoridate and
phenylmethanesulphonyl fluoride, potent inhibitors of serine proteinases, had no effect on
the cathepsins. Both proteolytic activities were inhibited only by pepstatin, one of the most
specific inhibitors in enzymology, which is highly selective for the aspartic proteinases.



Table 2. Effect of potential inhibitors on cathepsins I and II.

Compound                            Final            Inhibition (%)
                                    concentration    Cathepsin I    Cathepsin II
Iodoacetic acid                     10 mM            5.3            4.5
Iodoacetamide                       10 mM            4.1            4.6
Phenylmethanesulphonyl fluoride     1 mM             6.2            3.4
Diisopropyl phosphofluoridate       1 mM             2.5            3.5
Pepstatin                           \\xU             98.2           97.5

The only intracellular proteinases that may be inhibited by pepstatin are cathepsins D
(EC 3.4.23.5) and cathepsins E (EC 3.4.23.34) [10]. These two cathepsins are similar in
specificity to pepsin. Hemoglobin is the best protein substrate for these enzymes, and its
hydrolysis proceeds most rapidly at very low pH values.

The properties of cathepsin I, pH optimum 2.5 and molecular weight 82,000 Da,
suggest that it is a cathepsin E. Cathepsin E hydrolyzes hemoglobin most rapidly at around pH
2.5 and at a significantly lower rate at higher pH values such as pH 4.0 and pH 5.0 [11] and has a
molecular weight of about 76,000-80,000 Da [12]. Cathepsin II seems to be a cathepsin D,
an enzyme with a molecular weight of about 38,000-50,000 Da and an optimum pH of 3.5 [11].

The properties of cathepsin D isolated from white skeletal muscle of Carassius auratus
gibelio are similar to those of other fish cathepsins. A cathepsin D with a molecular weight of
38,000-39,000 Da, an optimum pH of 2.5 and a pI of 6.8 was purified from herring
(Clupea harengus) muscle. It was inhibited by pepstatin and was able to degrade myosin,
actin and tropomyosin [13]. Sex- and tissue-specific expression of aspartic proteinases was
shown in zebrafish (Danio rerio) [14]. Antibacterial cathepsins in different types of
ambicoloured Japanese flounder skin were reported [15].

One of the more powerful methods by which to
discover the functions of the proteolytic enzymes of
living tissues depends on the use of specific inhibitors
of these enzymes. We are successfully using inhibitory
antisera in a study of the role of cathepsin
D in cartilage breakdown (Dingle et al., 1971), but
effective inhibitors of low molecular weight might
have important additional applications. Cathepsin
D is a member of the 'acid proteinase' group of
enzymes in which the hydrolysis of peptide bonds is
thought to be catalysed by two or more carboxyl
groups. Other enzymes in this group are another
tissue proteinase, namely cathepsin E, the gastric
enzymes pepsin (several forms), gastricsin and rennin,
and numerous enzymes secreted by moulds and at
least one bacterium. The acid proteinases most
rapidly attack peptide bonds linking amino acids
with hydrophobic side chains. Among the few known
chemical inhibitors of these enzymes the best is
probably diazoacetylnorleucine methyl ester in the
presence of Cu²⁺ (Sodek & Hofmann, 1968;
Lundblad & Stein, 1969; Keilová & Lapresle, 1970),
but this often reacts slowly or incompletely, is unstable
and probably is not highly specific. Certainly,
it is not suitable for specific inhibition in biological
systems, in the way that specific antisera to cathepsin
D can be. The discovery of pepstatin in the culture
medium of a strain of Streptomyces may therefore
represent an important new development. Pepstatin
has the structure isovaleryl-L-valyl-L-valyl-4-amino-
3-hydroxy-6-methylheptanoyl-L-alanyl-4-amino-
3-hydroxy-6-methylheptanoic acid (Morishima et al.,
1970), and its discoverers have reported that it
inhibits pepsin and gastricsin (Umezawa et al., 1970;
Aoyagi et al., 1971) and also acid proteinase activity
attributed to cathepsin D in pig liver (Ikezawa et al.,
1971) and rabbit macrophages (Dannenberg et al.,
1972). The purpose of the experiments now reported
was to establish whether this might be a suitable
agent with which to investigate the biological functions
of the tissue acid proteinases.

Materials 

Pepstatin sodium salt, a product of the Banyu
Pharmaceutical Co. Ltd., Tokyo, Japan, was kindly
given by Dr. A. Dannenberg. Cathepsin D from
liver of rabbit and man was prepared essentially as
described by Barrett (1970); for each species an
unresolved mixture of isoenzymes was used. A specific
antiserum to rabbit cathepsin D was prepared as
described by Dingle et al. (1971). Cathepsin B1 was
isolated from human liver (A. J. Barrett, unpublished
work). Trypsin (bovine, 2x crystallized; Sigma
Chemical Co., St. Louis, Mo., U.S.A.), α-chymotrypsin
(bovine, 3x crystallized; Sigma), rennin (from
calf stomach; Koch-Light Laboratories, Colnbrook,
Bucks., U.K.), papain (2x crystallized; Sigma) and
pepsin (2x crystallized; Sigma) were purchased.

Results and discussion 

Specificity of inhibition. The specificity of inhibition 
of proteolytic enzymes by pepstatin (5µg/ml) 
was examined with trypsin (12.5µg/ml, with α-N-benzoyl-DL-arginine 
2-naphthylamide as substrate at 
pH7.5), chymotrypsin (50µg/ml, with α-N-glutaryl-DL-phenylalanine 
2-naphthylamide as substrate at 
pH7.5), papain (0.75µg/ml, with α-N-benzoyl-DL-arginine 
2-naphthylamide as substrate at pH6.0 in 
the presence of 2mM-cysteine and 1mM-EDTA) and 
cathepsin B1 (10µg/ml, under conditions identical 
with those used for papain). Pepsin (6µg/ml), cathepsin 
D (both rabbit and human) (8µg/ml) and rennin 
(1mg/ml) were assayed with haemoglobin as substrate 
(Barrett, 1970) at pH3.5, with 1µg of pepstatin/ml. 

Pepstatin had no effect on the activities of trypsin, 
chymotrypsin, papain or cathepsin B1, whereas 
pepsin, cathepsin D (from both species) and rennin 
were totally inhibited. The inhibition of cathepsin D 
was not diminished when the inhibitor had first been 
mixed with a large amount of heat-denatured (100°C, 
5min) cathepsin D. Complete inhibition was also 
produced by pepstatin at 1µg/ml (360ng/unit), 
whether or not haemoglobin (20mg/ml) had first 
been mixed with the enzyme, at pH3.5. 

Since cathepsin E was not available to use in pure 
form, its sensitivity to inhibition was determined by 
use of a homogenate of rabbit bone marrow, a tissue 
known to contain cathepsin E (Lapresle, 1970). 
Portions (3.0ml) of the homogenate (1 g of tissue in 
20ml of 1 % NaCl, pH6.8) were treated with sheep 
anti-(rabbit cathepsin D) serum (0-300µl), stored 
[Fig. 1 plot; abscissa: cathepsin D (units)] 

Fig. 1. Effect of increasing cathepsin D concentration 
on the digestion of haemoglobin in the presence of 
pepstatin 

The digestion of haemoglobin was measured as the 
increase in E280 of trichloroacetic acid-soluble products 
in the absence of inhibitor (○) or in the presence 
of 50ng (●), 100ng (▲) or 200ng (■) of pepstatin. 
Each assay was made in a total volume of 1.0 ml at 
pH3.5 (as described by Barrett, 1970), the enzyme 
and inhibitor being mixed at this pH. The results 
were obtained from two separate experiments. 



overnight at 4°C and centrifuged, to remove cathepsin 
D. The proteolytic activity of the supernatant 
fluids (0.5 ml) was determined at pH3.5, with haemoglobin 
as substrate, in the presence or absence of 
pepstatin (200ng/ml). It was found that 48% of the 
acid proteinase activity was not precipitable by the 
antiserum and this was attributed to the immunologically 
distinct cathepsin E. The entire acid proteinase 
activity of the extract was inhibited by 
pepstatin, however, and it was therefore concluded 
that cathepsin E is susceptible to inhibition by 
pepstatin. 

The finding that pepstatin had no effect on the 
'serine' or 'cysteine' proteinases, but was a potent 
inhibitor of the four acid proteinases tested, 
confirms and extends the results obtained by Aoyagi 
et al. (1971). 

Kinetics of inhibition of cathepsin D. A method 
described by Ackerman & Potter (1949) (see also 
Khoo & Russell, 1970) was used to determine 
whether the inhibition of human cathepsin D by 
pepstatin is reversible, 'pseudo-irreversible' or 
irreversible (Fig. 1). The results are characteristic of 
pseudo-irreversible or 'tight-binding' inhibition, in 
that the curves obtained in the presence of inhibitor 
tend to parallel the control curve obtained in its 
absence at high enzyme concentrations, but approach 
the origin, rather than intersecting the abscissa, at 
low concentrations of the enzyme. It is shown by 
extrapolation that, in the presence of excess of 
enzyme, 25 ng of pepstatin completely inhibits 1 unit 
of human cathepsin D. On the basis that the specific 
activity of pure human cathepsin D is 600 units/mg 
and its molecular weight 45000 (Barrett, 1970), and 
that the molecular weight of sodium pepstatin is 708 
(from Umezawa et al., 1970), and assuming that the 
pepstatin preparation was pure, 25 ng of pepstatin/ 
unit of enzyme represents a molar ratio of 0.95:1.0. 
Equimolar binding of pepstatin to pepsin has been 
reported by Aoyagi et al. (1971). Our results do not 
permit an accurate calculation of the dissociation 
constant, but it appears to be less than 10"® m. This 
finding is comparable with the value of less than 3 x 
10"' M for pepstatin with pepsin (Aoyagi et al^ 1971). 
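
The molar ratio quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (600 units/mg, molecular weights of 45000 and 708, and 25 ng of pepstatin per unit); the function and variable names are illustrative and not part of the original report.

```python
# Worked check of the pepstatin : cathepsin D molar ratio quoted in the text.
# All numerical inputs come from the paragraph above; the names are illustrative.

SPECIFIC_ACTIVITY_UNITS_PER_MG = 600.0   # pure human cathepsin D
ENZYME_MOL_WT = 45_000.0                 # g/mol
PEPSTATIN_MOL_WT = 708.0                 # g/mol, sodium salt
PEPSTATIN_NG_PER_UNIT = 25.0             # ng needed to inhibit 1 unit


def molar_ratio(inhibitor_ng_per_unit, inhibitor_mw, units_per_mg, enzyme_mw):
    """Return mol of inhibitor per mol of enzyme for one unit of activity."""
    enzyme_g_per_unit = 1e-3 / units_per_mg           # 1 mg = 1e-3 g
    enzyme_mol = enzyme_g_per_unit / enzyme_mw
    inhibitor_mol = inhibitor_ng_per_unit * 1e-9 / inhibitor_mw
    return inhibitor_mol / enzyme_mol


ratio = molar_ratio(
    PEPSTATIN_NG_PER_UNIT,
    PEPSTATIN_MOL_WT,
    SPECIFIC_ACTIVITY_UNITS_PER_MG,
    ENZYME_MOL_WT,
)
print(f"pepstatin : cathepsin D = {ratio:.2f} : 1.0")
```

The calculation reproduces the stated ratio only when the inhibitor quantity is taken in nanograms, which is consistent with the extrapolated value of 25 ng per unit.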

As would be expected from the extremely small 
dissociation constant, the inhibition of cathepsin D 
by pepstatin is very difficult to reverse experimentally; 
thus incomplete recovery of enzymic activity was 
achieved during dialysis for several days against a 
large volume of buffer. 

Samples of human cathepsin D without inhibitor, 
or with 15, 30, 60 or 120ng of pepstatin/unit of 
enzyme, were subjected to analytical isoelectric 
focusing in polyacrylamide gel (essentially by the 
method of Barrett, 1970), and the gels were split 
longitudinally so that half could be stained for protein 
and the corresponding profile of enzymic activity 
could be determined by assays made with sections cut 
transversely from the other half. The samples con- 
taining pepstatin showed the same degree of inhibition 
as had been determined before isoelectric focusing, 
and there was no detectable difference in degree of 
inhibition between the several separated isoenzymes. 
No shift in the positions of the bands (attributable 
to the binding of pepstatin, a monobasic acid) was 
detected, even at the highest concentrations of 
inhibitor. 

The antigenic properties of rabbit cathepsin D were 
shown quantitatively to be unaltered by the presence 
of a high concentration of pepstatin, in that the 
inhibitor did not affect the characteristics of binding 
of ^^*I-labelled enzyme to an antibody attached co- 
valently to Sepharose 4B (Dr. Zena Werb, personal 
communication). 

It seems most probable that pepstatin has a very 
high affinity for the active site of acid proteinases. The 
predominantly hydrophobic nature of its side chains 
is compatible with the affinity of the acid proteinases 
for substrates in which the bond to be hydrolysed is 
flanked by non-polar residues. From the data of 
Aoyagi et al. (1971) it seems that the hydroxyl group 
of the non-terminal residue of 4-amino-3-hydroxy-6- 
methylheptanoic acid is of great functional import- 
ance in the binding of pepstatin, perhaps through 
the formation of a hydrogen bond. 

On the basis of the above results, pepstatin seems 
to be a valuable group-specific inhibitor for the acid 
proteinases. It produces complete inhibition of cathepsins 
D and E at low concentration, binding in 
equimolar ratio to cathepsin D. In view of the high 
potency and low toxicity (Umezawa et al., 1970) of 
pepstatin, it may well be of use in establishing the 
function of tissue acid proteinases in tissue systems. 
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Part II 

PROTEIN CONFORMATION, 
DYNAMICS, AND FUNCTION 



The need for substrate-induced structural changes in the active site 
of carboxypeptidase A can now be appreciated. The bound substrate is 
surrounded on all sides by catalytic groups of the enzyme. This arrangement 
promotes catalysis for the reasons cited above. It is evident 
that a substrate could not enter such an array of catalytic groups (nor 
could a product leave) unless the enzyme were flexible. A flexible protein 
provides a much larger repertoire of potentially catalytic conformations than does 
a rigid one. 



Space-filling model of chymotrypsin. 
This enzyme is stabilized by several 
disulfide bonds (yellow). 



SITE-SPECIFIC MUTAGENESIS IS A POWERFUL METHOD FOR 
ENGINEERING NEW ENZYMES AND ELUCIDATING MECHANISMS 

In Agatha Christie's Murder on the Orient Express, Hercule Poirot's task as 
a detective was compounded by the presence of many plausible suspects. 
Likewise, the catalytic mechanism of carboxypeptidase A has 
been difficult to unravel because of the presence of several potential 
catalytic groups at the active site. For example, the hydroxyl group of 
tyrosine 248 was postulated to be the proton donor to the NH group of 
substrate in the cleavage step. This hypothesis was tested by 
Rutter using site-specific mutagenesis. If tyrosine 248 is essential for 
catalysis, a mutant of this enzyme containing phenylalanine in its place 
should be inactive. The tyrosine 248 codon (TAT) was converted to that 
of phenylalanine (TTT) by oligonucleotide-directed mutagenesis (p. 
[illegible]). The recombinant plasmid containing this gene was inserted into 
yeast and expressed. The striking finding was that the mutant enzyme 
had the same kcat value as did the native enzyme but a KM value that was 
sixfold higher. These results indicate that tyrosine 248 participates in 
the binding of substrate but is not essential for catalysis. This incisive 
experiment illustrates the power of site-specific mutagenesis in delineating 
the function of a particular residue in a protein. 
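
The codon change described above is a single-base substitution, which can be illustrated with a short sketch. The helper name `point_mutate` and the four-entry codon table are illustrative assumptions; only the TAT (tyrosine) to TTT (phenylalanine) conversion comes from the text.

```python
# Minimal sketch of the single-base change behind the Tyr 248 -> Phe mutant.
# Only the four codons for Tyr and Phe (standard genetic code) are listed here.
CODON_TABLE = {"TAT": "Tyr", "TAC": "Tyr", "TTT": "Phe", "TTC": "Phe"}


def point_mutate(codon, position, new_base):
    """Return a copy of `codon` with the base at `position` (0-based) replaced."""
    if len(codon) != 3 or position not in (0, 1, 2) or new_base not in "ACGT":
        raise ValueError("expected a 3-base codon, a position 0-2 and a DNA base")
    return codon[:position] + new_base + codon[position + 1:]


wild_type = "TAT"                          # tyrosine 248 codon named in the text
mutant = point_mutate(wild_type, 1, "T")   # change the middle base A -> T

print(wild_type, CODON_TABLE[wild_type], "->", mutant, CODON_TABLE[mutant])
```

Changing only the middle base is enough to swap the residue, which is why a short synthetic oligonucleotide carrying that one mismatch can direct the mutation.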

THREE-DIMENSIONAL STRUCTURE OF CHYMOTRYPSIN, 
A SERINE PROTEASE 

[...] proteins, the serine proteases. Trypsin, another digestive enzyme, 
[...] chymotrypsinogen. The three-dimensional structure of 
the enzyme was solved at 2-Å resolution (Figure 9-34) by David Blow. 
The molecule is a compact ellipsoid of dimensions 51 × 40 × 40 Å. 
Chymotrypsin contains several antiparallel β-pleated sheet regions and 
little α helix. All charged groups are on the surface of the molecule 
except for three that play a critical role in catalysis. 

The biological role of chymotrypsin is to catalyze the hydrolysis of 
proteins in the small intestine. Chymotrypsin does not cleave all peptide 
bonds at a significant rate. Rather, it is selective for peptide bonds on 



Figure 9-34 

Three-dimensional structure of a-chymotrypsin. Only the a-carbon atoms are 
shown. Catalytically important residues are marked in color. [Courtesy of Dr. 
David Blow.] 



Chapter 9 
ENZYME MECHANISMS 



the carboxyl side of the aromatic side chains tyrosine, tryptophan, and 
phenylalanine and of large hydrophobic residues such as methionine. 
Chymotrypsin also hydrolyzes ester bonds. Although unimportant 
physiologically, ester-bond hydrolysis is of interest because of its close 
relationship to peptide-bond hydrolysis (Figure 9-35). Indeed, much of our 
knowledge of the catalytic mechanism of chymotrypsin comes from 
studies of the hydrolysis of simple esters. 



Peptide:  R1-CO-NH-R2 + H2O → R1-COO⁻ + ⁺H3N-R2        (acid + amine) 

Ester:    R1-CO-O-R2 + H2O → R1-COO⁻ + HO-R2 + H⁺      (acid + alcohol) 



Figure 9-35 

Chymotrypsin catalyzes the hydrolysis of peptide and ester bonds. 



PART OF THE SUBSTRATE IS COVALENTLY BOUND 
TO CHYMOTRYPSIN DURING CATALYSIS 

Chymotrypsin catalyzes the hydrolysis of peptide or ester bonds in two 
distinct stages. This was first revealed by studies of the kinetics of hydrolysis 
of p-nitrophenyl acetate. When large amounts of enzyme are 
used, there is an initial rapid burst of p-nitrophenol product, followed by 
its formation at a much slower steady-state rate (Figure 9-36). 




Figure 9-36 

Two phases in the formation of p-nitrophenol are evident following the 
mixing of chymotrypsin and p-nitrophenyl acetate. [Plot abscissa: 
milliseconds after mixing.] 




[Scheme: p-nitrophenyl acetate + enzyme → acetyl-enzyme intermediate + p-nitrophenol] 

Figure 9-37 

Acylation: formation of the acetyl-enzyme intermediate in 
catalysis by chymotrypsin. 



[Scheme: acetyl-enzyme intermediate + H2O → enzyme + acetate] 

Figure 9-38 

Deacylation: hydrolysis of the acetyl-enzyme intermediate. 

The first step is the combination of p-nitrophenyl acetate with chymotrypsin 
to form an enzyme-substrate (ES) complex (Figure 9-37). 
The ester bond of this substrate is cleaved. One of the products, p-nitrophenol, 
is then released from the enzyme, whereas the acetyl group of 
the substrate becomes covalently attached to the enzyme. Water then 
attacks the acetyl-enzyme complex to yield acetate ion and regenerate 
the enzyme (Figure 9-38). The initial rapid burst of p-nitrophenol production 
corresponds to the formation of the acetyl-enzyme complex. 
This step is called acylation. The slower steady-state production of p- 
nitrophenol corresponds to the hydrolysis of the acetyl-enzyme com- 
plex to regenerate the free enzyme. This second step, called deacylation, 
is much slower than the first, so that it determines the overall rate of 
hydrolysis of esters by chymotrypsin. In fact, the acetyl-enzyme com- 
plex is sufficiently stable to be isolated under appropriate conditions. 
The catalytic mechanism of chymotrypsin can thus be represented by 
the adjacent scheme, in which P1 is the amine (or alcohol) component of 
the substrate, E — P2 is the covalent intermediate, and P2 is the acid 
component of the substrate. 
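
The two-phase time course described above can be sketched with a simple model. The code below assumes saturating substrate, so that every enzyme molecule is either the ES complex or the acyl-enzyme, and uses hypothetical rate constants chosen only so that acylation is much faster than deacylation; none of these numbers come from the text.

```python
import math

# Sketch of burst kinetics for  ES --k_acyl--> E-P2 + P1,  E-P2 --k_deacyl--> E + P2.
# Assumes saturating substrate; the rate constants below are hypothetical.


def p1_released(t, e0, k_acyl, k_deacyl):
    """Amount of first product (e.g. p-nitrophenol) released by time t."""
    k_sum = k_acyl + k_deacyl
    steady_state_rate = e0 * k_acyl * k_deacyl / k_sum   # slow linear phase
    burst_amplitude = e0 * (k_acyl / k_sum) ** 2          # fast exponential phase
    return steady_state_rate * t + burst_amplitude * (1.0 - math.exp(-k_sum * t))


E0 = 1.0          # total enzyme, arbitrary concentration units
K_ACYL = 100.0    # per second, hypothetical fast acylation
K_DEACYL = 1.0    # per second, hypothetical slow deacylation

for t in (0.0, 0.01, 0.05, 0.5, 1.0):
    print(f"t = {t:5.2f} s   P1 = {p1_released(t, E0, K_ACYL, K_DEACYL):.3f}")
```

When acylation is much faster than deacylation, the burst amplitude approaches the total enzyme concentration, which is why the burst is seen only when large amounts of enzyme are used.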

A distinctive feature of this mechanism is the appearance of a cova- 
lent intermediate. In the particular reaction discussed above, an acetyl 
group is covalently bonded to the enzyme. In general, the group at- 
tached to chymotrypsin at the E — P2 stage is an acyl group. Thus, E — P2 
is an acyl-enzyme intermediate. 



THE ACYL GROUP IS ATTACHED TO AN UNUSUALLY REACTIVE 
SERINE RESIDUE ON THE ENZYME 

The site of attachment of the acyl group was identified following the 
isolation of E — P2, which is quite stable at pH 3. The acyl group is 
linked to the oxygen atom of a specific serine residue, namely, serine 
195. This serine residue is unusually reactive. It can be specifically labeled 
with organic fluorophosphates, such as diisopropylphosphofluoridate 
(DIPF). DIPF reacts only with serine 195 to form an inactive 
diisopropylphosphoryl-enzyme complex, which is indefinitely stable. The 
remarkable reactivity of serine 195 is highlighted by the fact that the 
other 27 serine residues in chymotrypsin are untouched by DIPF. 

Chymotrypsin is not the only enzyme to be inactivated by DIPF. 
Numerous other proteolytic enzymes, such as trypsin, elastase, throm- 
bin, subtilisin, and acetylcholinesterase react specifically with DIPF and 
are inactivated. The reaction takes place at a unique serine residue, as 
in chymotrypsin. Hence, these enzymes are called the serine proteases. 



DEMONSTRATION OF THE CATALYTIC ROLE OF HISTIDINE 57 
BY AFFINITY LABELING 

The importance of a second residue in catalysis was shown by affinity 
labeling. The strategy was to react chymotrypsin with a molecule that (1) 
specifically binds to the active site because it resembles a substrate and 
then (2) forms a stable covalent bond with a group on the enzyme that is 
in close proximity. These criteria are met by tosyl-L-phenylalanine chloromethyl 
ketone (TPCK), whose structure is shown in Figure 9-39. The 
phenylalanine side chain of TPCK enables it to bind specifically to chy- 
motrypsin. The reactive group in TPCK is the chloromethyl ketone 
function, which alkylates one of the ring nitrogens of histidine 57. 
TPCK is positioned to react with this residue because of its specific 
binding to the active site of the enzyme. The TPCK derivative of chy- 
motrypsin is enzymatically inactive. 

Three lines of evidence indicated that histidine 57 is part of the active 
site. First, the affinity-labeling reaction was highly stereospecific; the 
D-isomer of TPCK was totally ineffective. Second, the reaction was 
inhibited when a competitive inhibitor of chymotrypsin, β-phenylpropionate, 
was present. Third, the rate of inactivation by 
TPCK varied with pH in nearly the same way as did the rate of catalysis. 
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Figure 9-39 

Affinity labeling of a histidine residue 
in chymotrypsin by tosyl-L-phenylalanine 
chloromethyl ketone 
(TPCK), a reactive substrate 
analog. 



SERINE, HISTIDINE, AND ASPARTATE 

FORM A CATALYTIC TRIAD IN CHYMOTRYPSIN 

The catalytic activity of chymotrypsin depends on the unusual properties 
of serine 195. A -CH2OH group is ordinarily quite unreactive 
under physiological conditions. What makes it so reactive in the active 
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Figure 9-40 

Conformation of the serine- 
histidine-aspartate catalytic triad in 
chymotrypsin. [After D. M. Blow 
and T. A. Steitz. X-ray diffraction 
studies of enzymes. Ann. Rev. Biochem. 
39(1970):86. Copyright © 1970 
by Annual Reviews Inc. All rights 
reserved.] 



site of chymotrypsin? A convincing explanation has emerged from 
x-ray studies of the three-dimensional structure of the enzyme. As was 
foreseen by affinity-labeling studies, histidine 57 is adjacent to serine 
195 (Figure 9-40). The carboxylate group of aspartate 102, buried in 
the protein, also is next to histidine 57. These three residues form a 
catalytic triad. 
In the absence of substrate, histidine 57 is unprotonated (Figure 
9-41). However, it is poised to accept the proton from the serine 195 
-OH group when this oxygen atom carries out a nucleophilic attack on 
the substrate. It was thought that aspartate 102 in turn became protonated, 
and so this triad became known as a charge relay network. However, 
neutron diffraction studies have shown that the proton stays on 
histidine 57 and that aspartate 102 remains negatively charged. Instead, 
the role of the -COO⁻ group of aspartate 102 is to stabilize the 
positively charged form of histidine 57 in the transition state. In addition, 
aspartate 102 orients histidine 57 and ensures that it is in the 
appropriate tautomeric form to accept a proton from serine 195. 
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Figure 9-41 

Role of the catalytic triad in chymotrypsin: 
(A) enzyme alone; (B) on addition of a sub- 
strate, a proton is transferred from serine 
195 to histidine 57. The positively charged 
imidazole ring is stabilized by electrostatic in- 
teraction with negatively charged aspartate 
102. 
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Figure 9-42 

Schematic representation of the 
binding of formyl-L-tryptophan, a 
substrate analog, to chymotrypsin. 



Crystallographic studies of complexes of chymotrypsin with substrate 
analogs have also shown the location of the site of specific recognition 
and the likely orientation of the susceptible peptide bond. Formyl-L-tryptophan 
binds to chymotrypsin with its indole side chain fitting 
neatly into a pocket near serine 195 (Figure 9-42). This deep cleft 
accounts for the specificity of chymotrypsin for aromatic and other bulky 
hydrophobic side chains. Crystallographic analyses of complexes of 
chymotrypsin and polypeptide substrate analogs show that the main 
chain of the enzyme and that of the substrate are hydrogen bonded to 
each other as in an antiparallel β pleated sheet. 



A TRANSIENT TETRAHEDRAL INTERMEDIATE 
IS FORMED DURING CATALYSIS BY CHYMOTRYPSIN 

A plausible catalytic mechanism for chymotrypsin has been deduced 
from extensive x-ray crystallographic and chemical data. In this mechanism, 
histidine 57 and serine 195 participate directly in the cleavage of the 
susceptible peptide bond of the substrate. Hydrolysis of the peptide bond 
starts with an attack by the oxygen atom of the hydroxyl group of serine 
195 on the carbonyl carbon atom of the susceptible peptide bond. The 
carbon-oxygen bond of this carbonyl group becomes a single bond, 
and the oxygen atom acquires a net negative charge. The four atoms 
now bonded to the carbonyl carbon are arranged as in a tetrahedron. 
The formation of this transient tetrahedral intermediate from a planar 
amide group is made possible by hydrogen bonds between the negatively 
charged carbonyl oxygen atom (called an oxyanion) and two main-chain 
NH groups (Figure 9-43). 

The other essential event in the formation of this tetrahedral transition 
state is the transfer of a proton from serine 195 to histidine 57 
(Figure 9-44). This proton transfer is markedly facilitated by the presence 
of the catalytic triad. Aspartate 102 precisely orients the imidazole 
ring of histidine 57 and partly neutralizes the charge that develops on it 
during the transition state. The proton held by the protonated form of 
histidine 57 is then donated to the nitrogen atom of the susceptible 
peptide bond, which thus is cleaved. At this stage, the amine component 
is hydrogen bonded to histidine 57, whereas the acid component 
of the substrate is esterified to serine 195. The amine component diffuses 
away, completing the acylation stage of the hydrolytic reaction. 

The next stage, deacylation (Figure 9-45), begins when a water molecule 
takes the place occupied earlier by the amine component of the 
substrate. In essence, deacylation is the reverse of acylation, with H2O substituting 
for the amine component. First, the charge relay network draws a 
proton away from water. The resulting OH⁻ ion immediately attacks 
the carbonyl carbon atom of the acyl group that is attached to serine 
195. As in acylation, a transient tetrahedral intermediate is formed. 
Histidine 57 then donates a proton to the oxygen atom of serine 195, 
which then releases the acid component of the substrate. This acid component 
diffuses away and the enzyme is ready for another round of 
catalysis. 

Figure 9-43 

The tetrahedral transition state in 
the acylation reaction of chymotrypsin. 
The hydrogen bonds formed by 
two NH groups from the main chain 
of the enzyme are critical in stabilizing 
this species. This site is called the 
oxyanion hole. 

Figure 9-44 

First stage in the hydrolysis of a peptide 
by chymotrypsin: acylation. A tetrahedral 
transition state is formed, in 
which the peptide bond is cleaved. The 
amine component then rapidly diffuses 
away, leaving an acyl-enzyme intermediate. 

Figure 9-45 

Second stage in the hydrolysis of a peptide 
by chymotrypsin: deacylation. The 
acyl-enzyme intermediate is hydrolyzed 
by water. Note that deacylation is essentially 
the reverse of acylation, with 
water in the role of the amine component 
of the original substrate. 

Figure 9-46 

Comparison of the conformation of 
the main chains of (A) chymotrypsin 
and (B) elastase. The locations of the 
catalytic triad (residues 102, 57, and 
195) are shown in color. [After B. S. 
Hartley and D. M. Shotton. In The 
Enzymes, P. D. Boyer, ed., 3rd ed., 
vol. 3 (Academic Press, 1971), p. 
362.] 



TRYPSIN AND ELASTASE: 

VARIATIONS ON THE CHYMOTRYPSIN THEME 

Trypsin and elastase, two other digestive enzymes, are like chymotrypsin 
in several respects: (1) About 40% of the amino acid sequences of 
these three enzymes are identical. The degree of identity is even higher 
(~60%) for residues located in the interior of the enzymes. (2) X-ray 
studies have shown that their three-dimensional structures are very 
similar (Figure 9-46). (3) A serine-histidine-aspartate catalytic triad is 
present in all three. (4) The serine residue in this triad is modified by 
fluorophosphates (such as DIPF), causing a loss of catalytic activity. The 
amino acid sequence around this serine is the same in all three enzymes: 
Gly-Asp-Ser-Gly-Gly-Pro. (5) All three enzymes have nearly identical 
catalytic mechanisms. The catalytic triad and oxyanion hole in each 
promotes the formation of a tetrahedral transition state. A covalent 
acyl-enzyme intermediate is formed by all three during catalysis. (6) As 
will be discussed in the next chapter, these enzymes are secreted by the 
pancreas as inactive precursors that become activated by cleavage of 
a single peptide bond. 

Although similar in structure and mechanism, these enzymes differ 
markedly in substrate specificity. Chymotrypsin requires an aromatic or 
bulky nonpolar side chain on the amino side of the scissile bond. Trypsin 
requires a lysine or arginine residue. Elastase cannot cleave either of 
these kinds of substrates. Its specificity is directed toward the smaller 
uncharged side chains. X-ray studies have shown that these different specificities 
are due to quite small structural differences in the binding site (Figure 
9-47). In chymotrypsin, a nonpolar pocket serves as a niche for the 
aromatic or bulky nonpolar side chain. In trypsin, one residue in this 
pocket is different from chymotrypsin: a serine is replaced by an aspartate. 
This aspartate in the nonpolar pocket of trypsin can form a strong 
electrostatic bond with a positively charged lysine or arginine side chain 
A highly simplified representation of part of the substrate-binding site in 
chymotrypsin, trypsin, and elastase. 
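
The specificity rules described above can be sketched as a simple lookup that cuts a peptide after each residue an enzyme recognizes. This is a deliberately crude illustration: the residue sets for chymotrypsin and trypsin follow the text, the set used for elastase (Ala, Gly, Ser) is an assumption standing in for "smaller uncharged side chains", the example sequence is hypothetical, and real cleavage also depends on neighboring residues.

```python
# Crude sketch: cut on the carboxyl side of residues each protease recognizes.
# One-letter amino acid codes. The elastase set is an illustrative assumption.
SPECIFICITY = {
    "chymotrypsin": set("FWYM"),  # aromatic residues plus methionine (from the text)
    "trypsin": set("KR"),         # lysine or arginine (from the text)
    "elastase": set("AGS"),       # assumed examples of small uncharged residues
}


def cleavage_sites(sequence, enzyme):
    """Return indices i such that the bond between residue i-1 and residue i is cut."""
    recognized = SPECIFICITY[enzyme]
    # The last residue has no peptide bond on its carboxyl side, so skip it.
    return [i + 1 for i, residue in enumerate(sequence[:-1]) if residue in recognized]


def digest(sequence, enzyme):
    """Return the fragments produced by cutting at every predicted site."""
    cuts = [0] + cleavage_sites(sequence, enzyme) + [len(sequence)]
    return [sequence[start:end] for start, end in zip(cuts, cuts[1:])]


peptide = "AKFGRWL"  # hypothetical example sequence
for name in SPECIFICITY:
    print(f"{name:13s} {digest(peptide, name)}")
```

The three enzymes share one catalytic mechanism, so the only thing that differs between the entries of the table is the set of side chains the binding pocket accepts.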
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